CHAPTER I
The  Problem 

Educators  and  psychologists  often  employ  statistical  formulas 
in  the  solution  of  their  problems  without  noting  the  assumptions 
on  which  these  formulas  are  based.  This  situation  is  particularly 
true with regard to the application of the methods of correlational
analysis to data obtained from mental and educational
tests.  The  theory  of  correlation,  as  developed  principally  by  the 
work  of  the  English  Biometric  School,  was  intended  to  apply  to 
measurable  physical  characters.  In  order  to  apply  it  to  test 
scores,  the  investigator  must  satisfy  himself  that  these  scores 
possess certain of the attributes of such measurements. Assumptions
so obvious as to need no explicit statement when applied to
measurements, such as equality of units at different parts of a
scale, and independence of observational errors and values of
the variable measured, require careful examination and verification
when applied to test scores.

The  first  problem  of  the  present  investigation  is  to  examine 
into  the  assumptions  which  underlie  the  principal  formulas  of 
correlational  psychology.  Some  of  these  assumptions  will  be 
found to be fundamental necessary conditions for the applicability
of the formulas. In such cases it is necessary to point out
the  limitations  in  methods  of  test  construction  and  application 
implied,  and  this  is  the  second  problem.  Some,  on  the  other 
hand,  which  have  been  employed  extensively  in  the  past,  will  be 
found  to  be  unnecessary  under  certain  experimental  conditions. 
The  third  problem,  then,  is  to  discover  these  conditions  wherever 
possible,  and  to  derive  modified  formulas  applicable  under  them. 

The  consistency  of  data  with  assumptions  can  often  be  checked 
by  statistical  tests  applied  to  the  data.  The  devising  of  such 
checks  is  a  fourth  problem.  Finally,  if  formulas  are  to  be  of  any 
considerable  usefulness,  their  standard  errors  should  be  known. 
The derivation of these standard errors for the principal formulas
of correlational psychology is the fifth and last problem to
be  dealt  with  in  this  study. 


CHAPTER  II 

RELIABILITY 

Notation.  A  number  of  mental  traits  are  to  be  measured  by 
fallible tests. Each test will be supposed to consist of two, three,
or more forms, A, B, C, etc. The values of the underlying traits,
considered as though measured without error, will be designated
$X_\infty$, etc. The subscripts of the corresponding
scores on Form A will be Arabic numerals, on Form B, small
Roman numerals, and on Form C, large Roman numerals. The
letter M will designate a mean, x a score taken as a deviation
from the mean, so that $x = X - M$, and $\sigma$ a standard deviation.
We then have the following system of notation:


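True value    Form A    Form B      Form C       Form A    Form A
of trait      score     score       score        mean      standard dev.

$X_\infty$    $X_1$     $X_i$       $X_I$        $M_1$     $\sigma_1$
(illegible)   $X_2$     $X_{ii}$    $X_{II}$     $M_2$     $\sigma_2$
(illegible)   $X_3$     $X_{iii}$   $X_{III}$    $M_3$     $\sigma_3$
(illegible)   $X_4$     $X_{iv}$    $X_{IV}$     $M_4$     $\sigma_4$
etc.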

This  system  of  notation  uses  only  a  single  subscript  to  designate 
any  of  the  quantities  involved,  and  it  may  readily  be  extended 
to  any  number  of  variables. 

Sources  of  error  in  mental  tests.  When  tests  are  used  in 
an  attempt  to  secure  an  estimate  of  the  magnitude  of  some  mental 
trait  in  an  individual,  there  are  at  least  four  sources  of  error. 
First,  the  test  items  may  sample  systematically  some  trait  other 
than  the  one  they  are  designed  to  measure.  Second,  the  test 
items  may  fail  systematically  to  sample  some  important  aspect 
of  the  trait  they  are  designed  to  measure.  These  first  two  sources 
of  error  are  the  essential  problems  of  validity,  and  they  need  not 
concern  us  at  present.  We  shall  assume  that  the  mental  trait 
measured by the several forms of a test is whatever mental ability or
combination  of  abilities  causes  simultaneous  variation  in  the  scores 
on  these  forms.* 

* A discussion of this and other assumptions regarding the fundamental
definition of reliability is given by Kelley (1924).

The third source of error resides in the fact that the given test
situation  is  only  a  single  sample  of  all  possible  situations  calling 
for  the  exercise  of  the  test  ability.  The  individual  changes  from 
hour  to  hour  and  from  day  to  day  in  a  manner  that  may  affect 
his  test  performance  without  changing  his  underlying  ability. 
It  might  at  first  seem  possible  to  estimate  the  magnitude  of  this 
individual  variability — or  rather  its  average  magnitude  in  a 
group — by  applying  the  same  test  at  different  times  and  noting 
the discrepancies between successive scores. The only reasonable
way to do this, in order to eliminate the disturbing effects of
memory  of  the  previous  responses  to  specific  test  items,  would  be 
to  wait  until  these  responses  had  been  forgotten,  and  the  time 
for  this  would  normally  be  so  great  that  it  would  no  longer  be 
safe  to  assume  that  there  had  been  no  significant  change  in  the 
underlying  trait  during  the  interval. 

The  fourth  source  of  error  lies  in  the  fact  that  any  single  test 
includes  only  a  limited  sample  of  the  possible  number  of  items 
measuring  the  trait  under  consideration.  The  magnitude  of  this 
error  may  be  estimated  by  comparing  the  scores  on  random 
halves  of  the  test,  i.e.,  by  giving  the  items  of  Forms  A  and  B 
simultaneously.  Errors  of  this  fourth  type  will  be  called  test 
errors  hereafter,  and  errors  of  the  third  type,  response  errors. 
The  two  sets  of  errors  taken  together  constitute  the  errors  of 
measurement,  when  we  are  considering  only  the  reliability,  and 
not  the  validity  of  the  test. 

Fundamental factor pattern. A factor pattern is an equation
or set of equations expressing the definitions and assumptions
regarding the make-up of test scores in terms of abilities
and errors in any given hypothetical situation. For the case of
tests measuring the same fundamental ability, we may think of
this ability as unitary, whence we have,

$$X_1 = f(X_\infty, \Delta_1),$$

where $\Delta_1$ is the error of measurement in $X_1$. If $X_1$ is related to
$X_\infty$ and $\Delta_1$ in an approximately linear manner, or if the variation
in the values of $X_1$ over the group in which it is measured is small
in comparison to the values themselves, no matter what the form
of the function, we may write, to a close approximation, transferring
origins to the means,

$$x_1 = c_1 x_\infty + \delta_1.$$

The value $c_1$ is a constant designating the ratio of the units in
which $x_1$ is measured to the units in which $x_\infty$ is measured. We
may assume without loss of generality that $\delta_1$ is measured in the
units of $x_1$. Then, considering two forms of the test, we may
write,

$$x_1 = c_1 x_\infty + \delta_1, \qquad x_i = c_i x_\infty + \delta_i. \tag{1}$$

Fundamental  factor  pattern  for  two  tests  that  measure  the 
same  ability. 

The factors $c_1$ and $c_i$ are constants descriptive of the units in
which $x_\infty$ is measured in the two forms of the test, and $\delta_1$ and $\delta_i$
are the respective errors of measurement. If we assume that
these errors of measurement are uncorrelated with each other
and with the true abilities of the subjects in any given group, as
is implied in our definition of the trait measured, we may write,

$$\sigma_1^2 = c_1^2 \sigma_\infty^2 + \sigma_{\delta_1}^2, \qquad \sigma_i^2 = c_i^2 \sigma_\infty^2 + \sigma_{\delta_i}^2, \tag{2}$$

and,

$$\Sigma x_1 x_i / N = \Sigma (c_1 x_\infty + \delta_1)(c_i x_\infty + \delta_i) / N = c_1 c_i \sigma_\infty^2. \tag{3}$$


Definition of reliability. Since $c_1^2 \sigma_\infty^2$ is less than $\sigma_1^2$ by an
amount equal to $\sigma_{\delta_1}^2$, we may define the reliability of the test as

the  ratio  of  the  true  variance  (squared  standard  deviation)  to  the 
obtained  variance.  This  definition,  it  should  be  noted,  is  quite 
general.  It  implies  only  the  previous  definition  of  the  mental 
trait measured. It does not assume that the errors of measurement
will sum to zero, nor that the several forms of the test are
equally variable or reliable. As a pure definition, it does not in
fact involve any other form of the test than the one under consideration.
It assumes only that in the group taken as the sample,
there is no correlation between the magnitude of the trait
and that of the error of measurement. Calling the reliability
coefficient so defined $R_1$, we have,

$$R_1 = c_1^2 \sigma_\infty^2 / \sigma_1^2. \tag{4}$$

Definition  of  the  reliability  coefficient  of  one  form  of  a  test. 
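
The factor pattern (1) and its consequences (2), (3), and (4) can be checked numerically. The following is only an illustrative sketch: the normal distributions, the constants, and all function names are assumptions made for the example and are not part of the original treatment.

```python
import random
import statistics


def simulate_two_forms(n, c1, ci, sd_true, sd_err1, sd_erri, seed=0):
    """Draw deviation scores x_1 = c1*x_inf + d_1 and x_i = ci*x_inf + d_i.

    Errors are drawn independently of each other and of the true values,
    which is the assumption behind equations (2) and (3).
    """
    rng = random.Random(seed)
    x_inf = [rng.gauss(0.0, sd_true) for _ in range(n)]
    x1 = [c1 * t + rng.gauss(0.0, sd_err1) for t in x_inf]
    xi = [ci * t + rng.gauss(0.0, sd_erri) for t in x_inf]
    return x_inf, x1, xi


def covariance(a, b):
    """Sum of deviation products divided by N (not N - 1), as in the text."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)


def theoretical_reliability(c, sd_true, sd_err):
    """Equation (4): true variance divided by obtained variance."""
    true_var = (c * sd_true) ** 2
    return true_var / (true_var + sd_err ** 2)


# Hypothetical constants chosen only for illustration.
x_inf, x1, xi = simulate_two_forms(
    20000, c1=2.0, ci=1.0, sd_true=3.0, sd_err1=4.0, sd_erri=1.5
)
r1_theory = theoretical_reliability(2.0, 3.0, 4.0)             # 36 / 52
r1_sample = 2.0 ** 2 * covariance(x_inf, x_inf) / covariance(x1, x1)
cov_theory = 2.0 * 1.0 * 3.0 ** 2                              # equation (3)
cov_sample = covariance(x1, xi)
```

In a sample of this size the sample values should lie close to the theoretical ones, although they will not agree exactly.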

Statistical estimation of reliability. The value of $\sigma_1^2$ may
be obtained directly from the data. The value of $c_1^2 \sigma_\infty^2$ is unknown,
but it may be estimated by introducing certain additional assumptions.
If we let $\Sigma x_1 x_i / N = p_{1i}$ (covariance of $x_1$ and $x_i$*), we obtain
at once from (3),

$$c_1 c_i \sigma_\infty^2 = p_{1i}. \tag{5}$$

To find the value of $c_1^2 \sigma_\infty^2$, we must evidently multiply this by the
ratio $c_1/c_i$.

It  is  ordinarily  impossible  to  determine  with  accuracy  how  the 
units  of  measurement  of  one  form  of  a  test  compare  with  those  of 
another  form.  In  a  fairly  large  sample  we  might  assume  that  an 
error  of  measurement  is  equally  likely  to  be  positive  or  negative, 
so  that  the  sum  of  all  such  errors  in  the  sample  would  approach 
zero. Then, since,

$$M_1 = c_1 M_\infty + M_{\Delta_1}, \quad \text{and} \quad M_i = c_i M_\infty + M_{\Delta_i},$$

$c_1/c_i = M_1/M_i$ + (terms of the order of $M_{\Delta_1}/M_i$ and
terms of higher orders, all of which are negligible
in comparison with $M_1/M_i$ if $M_{\Delta_1}$ and
$M_{\Delta_i}$ are close to zero).

This  assumption,  however,  is  quite  dangerous.  It  is  not  unlikely 
that either form of the test will contain unique non-chance elements,
which in spite of being non-vanishing in the group are
properly to be classed with the errors of measurement because
of their irrelevance. This point has been more fully discussed
by Kelley (1924). Furthermore, the values of $M_1$ and $M_i$ in ratio
comparisons  of  this  sort  must  be  measured  from  true  zero-points 
of  the  underlying  abilities,  a  condition  at  best  only  approximated 
by  a  very  few  mental  and  educational  tests.  And  this  very 
approximation  is  based  on  a  further  assumption.  In  the  scaling 
of  a  test,  the  standard  deviation  is  ordinarily  taken  as  the  unit 
of  measurement.  But  it  is  obvious  from  (2)  that  the  magnitude 
of  the  standard  deviation  depends  in  part  upon  the  size  of  the 
errors  of  measurement.  Hence,  a  mean,  expressed  in  terms  of 
some  multiple  of  the  corresponding  standard  deviation  as  a  unit, 
cannot  be  compared  with  another  mean  so  expressed,  except  on 
the  assumption  that  the  errors  of  measurement  are  proportional 
to  the  units,  i.e.,  that  the  fundamental  reliabilities  of  the  two 
forms of the test are equal. For example, suppose that the basic
units of measurement are equal in the two forms. Then in (2),
$c_1^2 \sigma_\infty^2 = c_i^2 \sigma_\infty^2$, but $\sigma_1^2$ will not equal $\sigma_i^2$ unless
$\sigma_{\delta_1}^2 = \sigma_{\delta_i}^2$ also. The
form of the test having the greater error of measurement will
have the greater standard deviation, and as a consequence its
mean, measured in standard units, will be lower than that of the
more reliable form. An important corollary of this fact may be
stated as follows: Standard scores are comparable measures only in
case the measurements so compared are of equal reliability.

* The covariance is the first product-moment coefficient of two sets of
observations.

There is one method of estimating the ratio $c_1/c_i$, however, that
is based entirely on averages. This method assumes that the difference
between the means of two groups will be the same whichever
of two fallible tests is used to determine this difference.
These means might be successive age or grade averages taken
from the test norms. If we choose some age or grade range such
that from 1/6 to 1/20 of the experimental group falls outside it at
either end, the mean score increments on the two forms of the test
corresponding to the given age or grade increment will be approximately
equivalent, and their ratio will be equal to $c_1/c_i$.

If our sample is large, we may obtain the ratio $c_1/c_i$ without
recourse to norms based on other groups, and so avoid the assumption,
implicit in the procedure of the previous paragraph, that
norms are available for the two forms of the test, obtained from
comparable groups. Giving each individual in the experimental
group a total score equal to the sum of his scores on the two forms,
we may secure the sub-groups by taking the lowest and highest
quarters of the original group*. The ratio of the differences between
the means of the sub-groups on the two tests may then be
taken as equal to $c_1/c_i$ for the given total group. When we know
this ratio, we may substitute at once in (4), and we obtain,

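$$R_1 = c_1^2 \sigma_\infty^2 / \sigma_1^2 = (c_1 c_i \sigma_\infty^2 / \sigma_1^2)(c_1/c_i),$$

and from (5),

$$R_1 = p_{1i} c_1 / (\sigma_1^2 c_i) = b_{i1} c_1 / c_i. \tag{6}$$

Reliability of Form A.

Similarly,

$$R_i = p_{1i} c_i / (\sigma_i^2 c_1) = b_{1i} c_i / c_1. \tag{7}$$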

Reliability  of  Form  B. 

* Strictly speaking, we should use the highest and lowest 27 per cent, as
Kelley has proved that in this case the ratio of the difference between the
means to its standard error will be a maximum, the distribution of scores in
the total group being normal. See Jensen (1928), p. 361.

If $c_1 = c_i$, these become simply,

$$R_1 = b_{i1}, \tag{8}$$

$$R_i = b_{1i}. \tag{9}$$

Reliabilities  of  the  two  forms  of  a  test  when  the  units  of 
measurement  are  equal. 

If $c_1/\sigma_1 = c_i/\sigma_i$, we obtain from (4) and (5),

$$R_1 = R_i = r_{1i}. \tag{10}$$

Reliabilities of equally reliable tests.

The value of $r_{1i}$ is the ordinary reliability coefficient. This has
often  been  supposed  to  apply  properly  only  to  comparable  tests 
— tests  measuring  the  same  ability  in  the  same  units  with  the 
same  error,  and  whose  standard  deviations  are  therefore  equal. 
It  is  seen  here  to  have  much  wider  usefulness  than  this,  being 
applicable  wherever  we  have  two  equally  reliable  tests  of  the  same 
ability,  no  matter  what  the  units  of  measurement  may  be  in 
either  case.    This  fact  was  pointed  out  by  Kelley  (1924). 
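
The extreme-group procedure for the ratio $c_1/c_i$ and its use in (6) and (7) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the default fraction of one quarter, and the use of plain lists of raw scores are choices made for the example, not prescriptions of the original method.

```python
import statistics


def covariance(a, b):
    """Sum of deviation products divided by N, as in the text."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)


def unit_ratio_from_extreme_groups(form_a, form_b, fraction=0.25):
    """Estimate c_1/c_i from the highest and lowest sub-groups.

    Subjects are ranked on the sum of their two scores; the ratio of the
    high-minus-low mean differences on the two forms is returned.
    """
    order = sorted(range(len(form_a)), key=lambda k: form_a[k] + form_b[k])
    size = max(1, int(len(order) * fraction))
    low, high = order[:size], order[-size:]
    diff_a = (statistics.fmean([form_a[k] for k in high])
              - statistics.fmean([form_a[k] for k in low]))
    diff_b = (statistics.fmean([form_b[k] for k in high])
              - statistics.fmean([form_b[k] for k in low]))
    return diff_a / diff_b


def reliabilities_two_forms(form_a, form_b, ratio):
    """Equations (6) and (7), with ratio standing for c_1/c_i."""
    cov = covariance(form_a, form_b)
    r_a = cov / covariance(form_a, form_a) * ratio   # b_i1 * c_1/c_i
    r_b = cov / covariance(form_b, form_b) / ratio   # b_1i * c_i/c_1
    return r_a, r_b
```

Setting `ratio=1.0` gives the equal-unit case of (8) and (9); passing `fraction=0.27` follows the footnote on the 27 per cent groups.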

Interpretation of the reliability coefficient. It is of interest
to note that the reliability coefficient is not only the ratio
of the variance of the true scores to that of the obtained scores,
but also the square of the correlation between the obtained
scores and the corresponding true scores, if this is linear. For,

$$r_{1\infty} = \Sigma x_\infty (c_1 x_\infty + \delta_1) / N \sigma_\infty \sigma_1 = c_1 \sigma_\infty / \sigma_1.$$

But

$$R_1 = c_1^2 \sigma_\infty^2 / \sigma_1^2,$$

So that

$$R_1 = r_{1\infty}^2. \tag{11}$$

And

$$r_{1\infty} = R_1^{1/2}. \tag{12}$$

This  last  expression  has  been  termed  the  index  of  reliability*. 

* Monroe (1923), p. 206, gave this name to the square root of the reliability
coefficient, ascribing it to Kelley, who had actually used it in a paper, but with
no intention of coining a term. Monroe appears to have been the first to do
this. See Walker (1929), p. 117, for further discussion of this point.

Reliability of the sum of the scores on two forms of a
test. If two forms of a test have been given, we may determine
the reliability of the total score obtained by adding together the
scores on the separate forms. For since,

$$x_{(1+i)} = x_1 + x_i = (c_1 + c_i) x_\infty + \delta_1 + \delta_i,$$

$$R_{(1+i)} = (c_1 + c_i)^2 \sigma_\infty^2 / \sigma_{(1+i)}^2
= (c_1^2 \sigma_\infty^2 + c_i^2 \sigma_\infty^2 + 2 c_1 c_i \sigma_\infty^2) / (\sigma_1^2 + \sigma_i^2 + 2 p_{1i}).$$

$$R_{(1+i)} = [p_{1i}(c_1/c_i + c_i/c_1 + 2)] / [\sigma_1^2 + \sigma_i^2 + 2 p_{1i}]. \tag{13}$$

Reliability  of  Form  A  plus  Form  B. 

If $c_1 = c_i$,

$$R_{(1+i)} = 4 p_{1i} / (\sigma_1^2 + \sigma_i^2 + 2 p_{1i}). \tag{14}$$

Reliability of Form A plus Form B when the units of measurement
are equal.

If in addition $\sigma_1^2 = \sigma_i^2$, we obtain, on dividing numerator and
denominator by $\sigma_1 \sigma_i$,

$$R_{(1+i)} = 2 r_{1i} / (1 + r_{1i}). \tag{15}$$

Reliability of Form A plus Form B when these are comparable
forms.

This  last  equation  is  the  well-known  Spearman-Brown  formula 
for  the  reliability  of  a  test  twice  as  long  as  either  of  the  forms  used 
in  computing  the  original  reliability  coefficient.  It  applies  strictly 
only to comparable tests, but Kelley has shown (1924) that whenever
the tests are of approximately equal reliability, and the units
of  one  are  not  very  much  greater  than  those  of  the  other  (more 
than  twice  as  great,  say),  it  still  applies  to  a  very  close 
approximation. 
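
Formulas (13), (14), and (15) can be compared on a small numerical case. The sketch below is illustrative only; the function names and the argument `ratio` (standing for $c_1/c_i$) are assumptions made for the example.

```python
def reliability_of_sum(cov, var_a, var_b, ratio=1.0):
    """Equation (13): reliability of Form A plus Form B.

    cov is the covariance of the two forms, var_a and var_b their
    variances, and ratio the unit ratio c_1/c_i. With ratio = 1 the
    expression reduces to equation (14).
    """
    return cov * (ratio + 1.0 / ratio + 2.0) / (var_a + var_b + 2.0 * cov)


def spearman_brown_double(r):
    """Equation (15): reliability of the sum of two comparable forms."""
    return 2.0 * r / (1.0 + r)


# Equal units and equal variances: (14) and (15) agree.
equal_case = reliability_of_sum(cov=6.0, var_a=10.0, var_b=10.0)
doubled = spearman_brown_double(6.0 / 10.0)

# Unequal units: c_1 = 2, c_i = 1, true variance 1, error variances 4 and 1,
# so that var_a = 8, var_b = 2, cov = 2, and the reliability is 9/14.
unequal_case = reliability_of_sum(cov=2.0, var_a=8.0, var_b=2.0, ratio=2.0)
```

In the first case both expressions give 0.75; in the second, (13) returns 9/14, which is the true variance of the sum divided by its obtained variance.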

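Reliability more accurately determined from three
forms. If three forms of a test have been given to the same
group, we may determine the reliability of any one of them without
recourse to the somewhat dubious methods of evaluating the
ratio $c_1/c_i$ described above. For, as in (1), (2), (3), and (5),

$$x_1 = c_1 x_\infty + \delta_1, \qquad \sigma_1^2 = c_1^2 \sigma_\infty^2 + \sigma_{\delta_1}^2, \qquad p_{1i} = c_1 c_i \sigma_\infty^2,$$

$$x_i = c_i x_\infty + \delta_i, \qquad \sigma_i^2 = c_i^2 \sigma_\infty^2 + \sigma_{\delta_i}^2, \qquad p_{1I} = c_1 c_I \sigma_\infty^2,$$

$$x_I = c_I x_\infty + \delta_I, \qquad \sigma_I^2 = c_I^2 \sigma_\infty^2 + \sigma_{\delta_I}^2, \qquad p_{iI} = c_i c_I \sigma_\infty^2.$$

From the three right-hand expressions,

$$p_{1i} p_{1I} / p_{iI} = c_1^2 \sigma_\infty^2, \tag{16}$$

and from (4),

$$R_1 = c_1^2 \sigma_\infty^2 / \sigma_1^2 = p_{1i} p_{1I} / (p_{iI} \sigma_1^2),$$

or,

$$R_1 = r_{1i} r_{1I} / r_{iI}. \tag{17}$$

Reliability of Form A from a knowledge of three tests of the
same ability.

Similarly,

$$R_i = r_{1i} r_{iI} / r_{1I}, \tag{18}$$

and,

$$R_I = r_{1I} r_{iI} / r_{1i}. \tag{19}$$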

The  square  roots  of  quantities  such  as  those  given  in  (17),  (18), 
and  (19)  have  been  given  by  Spearman  (1927),  Appendix  p.  xvi, 
as the correlations between the respective tests and g, the general
factor common to them, when the theory of two factors
holds.  These  equations  express  this  relationship  for  the  special 
case  in  which  the  specific  factors  can  be  taken  entirely  as  errors  of 
measurement. 
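
As a brief illustration of (17), (18), and (19), the three reliabilities may be computed directly from the three intercorrelations. The function and argument names below are hypothetical, and the numerical case is constructed for the example.

```python
def reliabilities_from_three_forms(r_ab, r_ac, r_bc):
    """Equations (17), (18), (19) for Forms A, B, and C.

    r_ab, r_ac, r_bc are the three intercorrelations of the forms.
    """
    rel_a = r_ab * r_ac / r_bc
    rel_b = r_ab * r_bc / r_ac
    rel_c = r_ac * r_bc / r_ab
    return rel_a, rel_b, rel_c


# Constructed case: correlations with the true score of 0.8, 0.9, and 0.7,
# so the intercorrelations are 0.72, 0.56, and 0.63.
rel_a, rel_b, rel_c = reliabilities_from_three_forms(0.72, 0.56, 0.63)
```

The results recover the squares of the assumed correlations with the true score (0.64, 0.81, 0.49), in agreement with (11).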

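For the reliability of the total score obtained by adding together
the scores on the three forms, we have,

$$R_{(1+i+I)} = (c_1 + c_i + c_I)^2 \sigma_\infty^2 / \sigma_{(1+i+I)}^2,$$

$$R_{(1+i+I)} = (p_{1i} p_{1I}/p_{iI} + p_{1i} p_{iI}/p_{1I} + p_{1I} p_{iI}/p_{1i}
+ 2 p_{1i} + 2 p_{1I} + 2 p_{iI}) / (\sigma_1^2 + \sigma_i^2 + \sigma_I^2
+ 2 p_{1i} + 2 p_{1I} + 2 p_{iI}). \tag{20}$$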

Reliability  of  Form  A  plus  Form  B  plus  Form  C. 

Reliability  of  the  sum  of  several  tests.  Formula  (20)  may 
be  generalized  to  give  the  reliability  of  the  sum  of  any  number  of 
tests  of  the  same  function.  We  shall  have  to  change  our  notation 
in dealing with more than three tests. Calling the forms $X_1$, $X_2$,
..., $X_n$, and the reliability coefficient $R_n$ instead of $R_{(1+2+\ldots+n)}$,
we have for n forms,

$$R_n = (c_1 + c_2 + \ldots + c_n)^2 \sigma_\infty^2 / \sigma_{(1+2+\ldots+n)}^2.$$

In estimating the value of $c_1^2 \sigma_\infty^2$, we may take as equally valid the
values $p_{12} p_{13}/p_{23}$, $p_{12} p_{14}/p_{24}$, ..., $p_{1(n-1)} p_{1n}/p_{(n-1)n}$, and the final
estimate will be the average of all the values so obtained, $(n-1)(n-2)/2$
in number. But we must also determine $c_2^2 \sigma_\infty^2$, $c_3^2 \sigma_\infty^2$, ...,
$c_n^2 \sigma_\infty^2$ by the same process. It is evident, therefore, that the numerator
will contain all the possible values of $p_{jk} p_{jq}/p_{kq}$, where j, k,
and q take all possible combinations of values from 1 to n except
one another. We obtain therefore,

$$R_n = \bigl[ 2 S'(p_{jk} p_{jq}/p_{kq}) / [(n-1)(n-2)] + 2 S p_{jk} \bigr]
\div \bigl[ \Sigma \sigma_j^2 + 2 S p_{jk} \bigr]. \tag{21}$$

Reliability of the sum of n tests of the same function.

Where $\Sigma$ is a summation from 1 to n,
$S$ is a summation from 1 to $n(n-1)/2$, and
$S'$ is a summation from 1 to $n(n-1)(n-2)/2$.
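
Formula (21), which contains (20) as the case n = 3, may be sketched in code as follows. The representation of the data as a square covariance matrix (nested lists, variances on the diagonal) and the function name are assumptions made for this illustration.

```python
from itertools import combinations


def reliability_of_sum_n(cov):
    """Equation (21): reliability of the sum of n forms of the same function.

    cov is an n-by-n covariance matrix with the variances on the diagonal.
    At least three forms are required.
    """
    n = len(cov)
    if n < 3:
        raise ValueError("at least three forms are required")

    # S': every value p_jk * p_jq / p_kq with j, k, q all different,
    # n(n-1)(n-2)/2 terms in all.
    triple_sum = 0.0
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for k, q in combinations(others, 2):
            triple_sum += cov[j][k] * cov[j][q] / cov[k][q]

    # S: the n(n-1)/2 distinct covariances.
    pair_sum = sum(cov[j][k] for j, k in combinations(range(n), 2))
    var_sum = sum(cov[j][j] for j in range(n))

    numerator = 2.0 * triple_sum / ((n - 1) * (n - 2)) + 2.0 * pair_sum
    return numerator / (var_sum + 2.0 * pair_sum)


# Constructed case: units c = (2, 1, 3, 1), true variance 1, and error
# variances (1, 2, 3, 4), so the reliability of the sum is 49/59.
c = [2.0, 1.0, 3.0, 1.0]
err = [1.0, 2.0, 3.0, 4.0]
example_cov = [[c[j] * c[k] + (err[j] if j == k else 0.0) for k in range(4)]
               for j in range(4)]
example_reliability = reliability_of_sum_n(example_cov)
```

With error-free covariances, as here, each averaged term reproduces $c_j^2 \sigma_\infty^2$ exactly; with sample covariances the averaging merely steadies the estimate.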

Criterion  of  equal  units  of  measurement.  If  the  several 
forms  of  the  test  are  measured  in  the  same  fundamental  units  but 
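differ in their reliabilities; i.e., if $c_1 = c_2 = \ldots = c_n$, but $\sigma_1, \sigma_2, \ldots, \sigma_n$
are different, then,

$$c_1 c_2 \sigma_\infty^2 = c_1 c_3 \sigma_\infty^2 = \ldots = c_1 c_n \sigma_\infty^2 = \ldots = c_{(n-1)} c_n \sigma_\infty^2,$$

whence,

$$p_{12} = p_{13} = \ldots = p_{1n} = \ldots = p_{(n-1)n}.$$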

This may be stated: If several tests of the same function measure
that function in the same basic units, the covariances of
these tests will all be equal, except by chance. We may treat
the forms of the test as successive samplings of the ability
of the group, and compare the difference between any two covariances
with the standard error of this difference. If there are a
considerable number of forms, more than 6 or 7, say, we may compute
the standard deviation of the actual distribution of covariances,
and compare this with the theoretical standard error of a
random covariance of the mean order of magnitude of those
observed. Formulas for such comparisons will be given in Chapter
VI. These comparisons permit the investigator to determine
when  it  is  necessary  to  use  a  full  formula,  such  as  (20)  or  (21),  and 
when  it  is  reasonable  to  use  the  simplified  formulas  next  to  be 
considered. 

Special  cases  of  reliability  of  the  sum  of  several  tests. 

From  (21),  assuming  all  covariances  equal, 
$$R_n = n^2 p / (\Sigma \sigma_j^2 + n(n-1) p). \tag{22}$$

Reliability of the sum of n tests of the same ability, measured
in the same units.

As before, $\Sigma$ is a summation from 1 to n, and p is taken here as the
mean covariance. If in addition $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_n^2$, within the
standard errors of their differences,

$$R_n = n p / (\sigma^2 + (n-1) p), \tag{23}$$

where $\sigma^2$ is the mean variance. Then on dividing numerator and
denominator by $\sigma^2$,

$$R_n = n r / (1 + (n-1) r). \tag{24}$$

Reliability  of  the  sum  of  n  comparable  tests. 

This is the Spearman-Brown formula, used to estimate the reliability
of the sum of a number of comparable tests, when all the
tests  have  actually  been  given  to  a  group.  The  value  of  r  should 
be  taken  as  the  average  intercorrelation.  The  computation  of 
this  value  has  been  discussed  by  Edgerton  and  Toops  (1928). 
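
The simplified formulas (22) and (24) lend themselves to a short sketch. The function names are hypothetical, and the numerical case is constructed so that the two formulas must agree.

```python
def reliability_equal_units(variances, mean_cov):
    """Equation (22): n forms in the same units, with mean covariance p."""
    n = len(variances)
    return n * n * mean_cov / (sum(variances) + n * (n - 1) * mean_cov)


def spearman_brown(r, n):
    """Equation (24): n comparable forms with average intercorrelation r."""
    return n * r / (1.0 + (n - 1) * r)


# Four forms, each of variance 10, with mean covariance 5 (so r = 0.5).
by_covariances = reliability_equal_units([10.0, 10.0, 10.0, 10.0], 5.0)
by_correlation = spearman_brown(0.5, 4)
```

Both computations give 0.8 here; when the variances differ, only (22) remains applicable.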

Estimated reliability of one form of a test from a knowledge
of several. For this purpose we have as a generalization
of (17),

$$R_1 = 2 S''(r_{1j} r_{1k} / r_{jk}) / [(n-1)(n-2)]. \tag{25}$$

Reliability of Form A, from a knowledge of n tests of the
same ability.

The symbol $S''$ represents a summation from 1 to $(n-1)(n-2)/2$.
If $p_{12} = p_{13} = \ldots = p_{1n} = \ldots = p_{(n-1)n}$, so that we may assume
that $c_1 = c_2 = \ldots = c_n$, we have as a generalization of (8),

$$R_1 = \Sigma b_{j1} / (n-1). \tag{26}$$

Reliability  of  Form  A  from  a  knowledge  of  n  tests  of  the  same 
ability  measured  in  the  same  units. 

The  case  in  which  we  have  comparable  tests  will  be  discussed 
later. 
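
Formulas (25) and (26) may be illustrated in the same way. In this sketch the first row and column of each matrix are taken to belong to Form A; that layout and the function names are assumptions of the example. The regression coefficient $b_{j1}$ is computed as $p_{1j}/\sigma_1^2$, following (6).

```python
from itertools import combinations


def reliability_of_first_form(corr):
    """Equation (25): average of r_1j * r_1k / r_jk over the other forms."""
    n = len(corr)
    if n < 3:
        raise ValueError("at least three forms are required")
    pairs = list(combinations(range(1, n), 2))   # (n-1)(n-2)/2 pairs
    total = sum(corr[0][j] * corr[0][k] / corr[j][k] for j, k in pairs)
    return total / len(pairs)


def reliability_first_form_equal_units(cov):
    """Equation (26): mean of b_j1 = p_1j / sigma_1^2 over the other forms."""
    n = len(cov)
    return sum(cov[0][j] for j in range(1, n)) / ((n - 1) * cov[0][0])


# Constructed case for (25): correlations with the true score of
# 0.8, 0.9, 0.7, and 0.6, so the reliability of the first form is 0.64.
load = [0.8, 0.9, 0.7, 0.6]
corr = [[1.0 if j == k else load[j] * load[k] for k in range(4)]
        for j in range(4)]
rel_25 = reliability_of_first_form(corr)

# Constructed case for (26): equal units, true variance 4, and error
# variances 1, 2, 3, so the reliability of the first form is 4/5.
cov = [[5.0, 4.0, 4.0], [4.0, 6.0, 4.0], [4.0, 4.0, 7.0]]
rel_26 = reliability_first_form_equal_units(cov)
```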

Practical implications of assumptions. All of the formulas
for reliability rest on the assumption that the errors of measurement
are uncorrelated. The mean square error, and its complement,
the reliability coefficient, are estimated from the ratio of
the true variance to the obtained variance. The latter will always
contain both the response error and the test error. But the experimental
conditions must be relied upon to insure that our estimate
of the true variance is free from any systematic error of either
sort.

Consider  the  response  error.  The  several  forms  of  the  test  must 
be  given  at  such  intervals  that  in  general  no  important  aspect  of 
this error is necessarily present at successive testings. It is obvious
that they should not be given at the same sitting. They
should  probably  be  given  at  least  one,  and  perhaps  several,  days 
apart.  But  as  the  time  between  successive  testings  is  lengthened, 
the  response  error  merges  gradually  with  progressive  changes  in 
the  underlying  ability  itself.  The  time  should  be  short  enough  to 
warrant  the  assumption  that  there  has  been  no  significant  growth 
or  learning  during  the  interval,  but  long  enough  to  insure  that 
there  are  no  important  elements  of  the  response  error  that  persist 
from  one  testing  to  the  next.  It  should  also  be  of  such  duration 
as  to  avoid  recurrent  response  errors.  Thus  the  forms  should 
probably  not  be  given  at  the  same  hour  of  the  day,  nor  perhaps 
even on the same day of the week. There is no mathematical criterion
here, but the one who uses tests, if he desires to estimate
their  reliabilities,  must  keep  in  mind  the  nature  of  these  errors 
in  planning  the  appropriate  intervals  for  any  particular  testing 
program. 

The  practice  of  giving  a  single  test  to  a  group,  taking  the  odd 
and even questions (or any other combination) as two forms, computing
their correlation, applying the Spearman-Brown formula,
and  assuming  that  the  resulting  coefficient  is  a  valid  estimate  of 
the  reliability  of  the  test,  is  in  error.  Most  if  not  all  of  the  response 
error  for  each  individual  will  be  common  to  both  "forms,"  and  the 
resulting  correlation  between  them  will  therefore  be  exaggerated. 
The  difference  between  this  coefficient  and  unity,  in  fact,  might 
well  be  defended  as  a  measure  of  the  unreliability  of  the  test  due 
to  test  errors  alone,  and  the  difference  between  it  and  one  obtained 
from  the  whole  test  and  another  comparable  form  given  after  a 
suitable interval might be taken as the unreliability due to response
errors alone. Woodyard (1926), however, presents evidence
that the response error is small as compared to other errors
in  tests.    She  states  (p.  3), 

"The  most  general  conclusion  from  a  review  of  the  evidence  is 
that  the  time  factor  is  of  small  moment  in  causing  an  individual 
to vary in the mental work he produces under conditions found in
the administration of such standard tests as are common in intelligence
and educational testing. In practically all of the data
examined,  whatever  correlation  is  obtained  for  the  short  time 
interval  is  changed  by  but  a  few  points  in  the  second  decimal  place 
when  the  time  interval  is  increased." 

When  the  test  material  is  of  such  a  nature  that  examination 
with  one  form  constitutes  significant  practice  for  the  subject, 
tending  to  raise  his  scores  on  subsequent  forms  appreciably,  we 
have  a  special  problem.  Any  correlation  between  ability  and 
improvability  will  introduce  correlated  response  errors.  This 
matter has been discussed by several writers, notably Spearman
(1910),  Brown  (1910  exp.)  and  (1913),  and  Wilton  (1914).  The 
last-named  writer  finds  that  these  errors  can  be  eliminated  by 
using  an  odd  number  of  forms — at  least  five  are  necessary — given 
at  successive  equal  intervals.  He  takes  the  sum  of  the  scores  on 
the  even  forms  as  Form  A,  and  the  sum  of  the  scores  on  the  odd 
forms  as  Form  B,  multiplying  the  first  and  last  of  these  by  0.5. 
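
One reading of the weighting just described can be sketched as follows. It is offered only as an interpretation: it assumes that "the first and last of these" refers to the first and last of the odd-numbered forms, and the function name is hypothetical. Under that reading, a practice gain that grows by a constant amount at each testing contributes equally to the two composite forms.

```python
def balanced_halves(scores):
    """Combine one subject's scores on an odd number (at least five) of forms.

    scores lists the scores in the order in which the forms were given.
    Form A is the sum of the 2nd, 4th, ... scores; Form B is the sum of
    the 1st, 3rd, ... scores with the first and last weighted by 0.5.
    """
    n = len(scores)
    if n < 5 or n % 2 == 0:
        raise ValueError("an odd number of forms, at least five, is required")
    form_a = sum(scores[1::2])
    odd = scores[0::2]
    form_b = 0.5 * odd[0] + sum(odd[1:-1]) + 0.5 * odd[-1]
    return form_a, form_b


# A subject of constant ability 10 who gains 3 points at every testing.
with_practice = [10.0 + 3.0 * k for k in range(5)]
form_a, form_b = balanced_halves(with_practice)
```

In this constructed case both composites equal 32, so the steady gain no longer produces a difference between the two forms.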

The  "halo"  error  in  scoring  subjective  tests,  comparing  speci- 
mens of  work  with  quality  scales,  and  making  trait  ratings,  may 
be  treated  as  a  part  of  the  response  error  in  estimating  reliability. 
The  reliability  of  an  essay  test  can  be  determined  by  giving  some 
of  the  questions  on  one  day  and  some  on  another,  and  having  a 
different  person  mark  each  set.  The  two  sets  then  become  the 
two  forms  of  the  test.  In  the  case  of  the  quality  scale,  the  subject 
should  submit  two  specimens  of  his  work  produced  on  different 
occasions.  These  should  then  be  compared  with  the  scale  by 
different  people.  Trait  ratings  should  be  obtained  from  raters 
whose  contacts  with  the  person  rated  have  been  largely  at  different 
times,  if  this  is  possible.  In  the  last  two  cases,  there  will  still 
remain  the  scale  error — the  error  resulting  from  the  fact  that  both 
raters use the same specific set of scaled specimens or rating devices.
No author to date has apparently thought it necessary to
provide  duplicate  forms  for  a  quality  scale  or  rating  scale,  and 
no  one  yet  knows  to  what  extent  two  scales  can  differ  in  external 
appearance  and  still  remain  equally  valid.  Finally,  there  is  the 
error  involved  in  assuming  that  the  raters  are  equally  competent. 
The  reliability  of  a  rating  depends  upon  the  reliability  of  the 
rater  as  well  as  upon  those  of  the  person  rated  and  the  scale  used. 

The test error is a matter that concerns chiefly the author of
the test. If it is to be uncorrelated with the true ability, the test
must be equally reliable throughout its useful range, that is to
say,  the  test  errors  of  those  who  make  the  higher  scores  must  be 
neither  greater  nor  less,  on  the  whole,  than  the  errors  of  those  who 
make  the  lower  scores.  This  implies  that  if  the  questions  be 
arranged  in  order  from  easy  to  hard,  the  increments  of  difficulty 
must  in  general  be  equal,  or  at  least  that  these  increments  must 
not  exhibit  any  progressive  change.  For  if,  say,  the  increments 
of  difficulty  are  greater  in  the  upper  ranges  of  the  test  than  in  the 
lower,  the  test  errors  will  for  that  reason  be  higher  for  those  who 
make high scores than for those who make lower scores. Furthermore,
the intrinsic excellence of the easy questions must in general
be equal to that of the harder ones, for the same reason. It
is  conceivable  that  a  test  might  be  constructed  in  such  a  manner 
that  a  systematic  change  in  the  intrinsic  excellence  of  questions, 
going  from  easy  to  hard,  might  just  be  balanced  by  a  systematic 
change  in  the  increments  of  difficulty,  proceeding  in  the  opposite 
direction,  but  such  a  balance  is  not  likely  to  be  achieved  in  practice. 

From  the  standpoint  of  reliability,  the  intrinsic  excellence  of  a 
question  may  be  judged  roughly  by  its  biserial  correlation  with 
the  total  test  score.  Strictly  speaking,  this  intrinsic  excellence 
should  be  defined  as  the  partial  biserial  regression  of  the  partic- 
ular question  on  the  total  score,  eliminating  all  the  other  ques- 
tions, but  the  computation  of  such  a  regression  coefficient  involves 
the  fourfold  point-correlation  of  every  question  with  every  other 
question,  a  task  which  is  in  practice  prohibitive.  The  specific 
excellence  of  the  question — its  biserial  correlation  with  an  out- 
side criterion — may  be  taken  in  most  cases  as  equivalent  to  its 
intrinsic  excellence  for  scaling  purposes. 

A  speed  test  should  consist  of  questions  of  approximately  equal 
difficulty,  or  if  this  is  impossible,  the  easy  and  hard  questions 
should  be  interspersed  in  random  order  rather  than  arranged  in 
order  of  difficulty,  since  hard  questions  represent  greater  incre- 
ments of  difficulty  than  do  easy  ones  from  the  standpoint  of  speed, 
if  the  test  is  scored  according  to  the  number  correctly  answered. 
In  fact,  it  would  appear  that  the  only  tests  in  which  the  questions 
should  properly  be  arranged  in  order  from  easy  to  hard  are  pure 
power  tests — perhaps  only  those  designed  to  be  applied  without 
time  limits. 

Comparable  tests.  Let  us  assume  that  we  have  only  two 
forms of a test; that these forms consist of questions of approximately
equal intrinsic excellence; that on each form there are
about  the  same  number  of  easy  and  hard  questions,  so  that  the 
increments  of  difficulty  may  be  assumed  to  take  no  systematic 
trend,  or  (if  the  test  has  a  time  limit  short  enough  to  cause  notice- 
able differences  between  scores  achieved  under  this  limit  and 
scores  achieved  when  there  is  no  such  limit)  that  the  questions 
do  not  vary  greatly  in  difficulty  and  are  arranged  in  random  order 
in  this  respect;  and  that  the  number  of  questions  on  each  form  is 
fairly  large,  say  50  or  more.  In  this  case  any  observed  discrep- 
ancy between  the  mean  score  differences  of  the  two  forms,  obtained 
from  groups  or  sub-groups  of  different  ability,  will  probably  give 
no  useful  information  regarding  the  relative  magnitudes  of  the 
units  of  measurement  in  the  forms,  and  we  may  just  as  well  take 
these  units  as  equal  or  as  proportional  to  the  corresponding 
standard  deviations.  Small  differences  in  mean  score  increments 
between  successive  ages  or  grades  are  probably  in  this  case  to  be 
attributed  more  to  differences  between  the  groups  used  in  deriv- 
ing the  norms  than  to  differences  in  the  units  of  measurement, 
and  slight  discrepancies  in  the  mean  score  differences  between  the 
upper  and  lower  quarters  of  a  group  may  be  attributed  to  errors 
of  measurement,  unless  the  group  is  quite  large.  Hence  for  prac- 
tical purposes,  formulas  (10)  and  (15)  may  be  recommended  for 
estimating  the  reliabilities  of  fairly  comparable  tests.  If  three  or 
more  forms  are  available,  formulas  (17),  (18),  (19),  (20),  (21), 
and  (25)  are  still  to  be  preferred,  even  though  the  forms  are  fairly 
comparable.  Cases  in  which  tests  measure  in  the  same  basic 
units  but  with  markedly  different  reliabilities  are  likely  to  be  rare 
in  practice.  For  this  reason  formulas  (8),  (9),  (14),  (22),  and  (26) 
are  probably  of  theoretical  interest  only. 

A  reasonable  check  on  the  comparability  of  tests  is  to  compare 
their  variances  and  covariances,  which  should  all  in  this  case  be 
equal  respectively  to  one  another.  This  is  not  a  rigorous  criterion, 
since  it  is  possible,  though  highly  improbable,  that  the  forms 
might  differ  in  their  fundamental  units  of  measurement  in  one 
direction  and  in  their  test  errors  in  the  other  in  such  a  manner  as 
to  make  their  variances  equal.  If  all  the  covariances  are  equal, 
the  tests  necessarily  measure  in  the  same  basic  units.  Tests  may 
be  comparable  even  though  their  means  are  different,  provided 
that  there  are  no  scores  close  to  zero  or  perfection  on  any  form, 
and  the  variances  and  covariances  are  equal  respectively  to  one 
another.  For example, it would be quite simple to increase the
obtained mean score on one form by adding a number of questions
so  easy  that  practically  everyone  taking  the  test  could  answer 
them  all  correctly,  without  affecting  its  comparability  to  the 
other  forms. 
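
The check just described may be sketched in Python as follows. This is an illustration only: the function names and the scores are hypothetical, and the covariances are assumed to be taken with divisor N. The second form is the first plus a constant, so the means differ while the variances and the covariance agree.

```python
from itertools import combinations
from statistics import mean


def covariance(x, y):
    """Covariance with divisor N: mean product of deviations from the means."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def comparability_summary(forms):
    """Return the variance of each form and the covariance of each pair.

    `forms` maps a form label to the scores of the same N persons.
    """
    variances = {k: covariance(v, v) for k, v in forms.items()}
    covariances = {
        (a, b): covariance(forms[a], forms[b])
        for a, b in combinations(sorted(forms), 2)
    }
    return variances, covariances


# Hypothetical scores: form B is form A plus 2 for every person.
scores = {"A": [1, 2, 3, 4], "B": [3, 4, 5, 6]}
variances, covariances = comparability_summary(scores)
print(variances, covariances)
```

With real data the variances and covariances would agree only within their sampling errors, so the comparison is a matter of judgment rather than of exact equality.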

Estimated  reliabilities  of  comparable  tests.  It  is  often 
necessary  to  predict  the  reliability  of  the  sum  of  several  compar- 
able tests  from  a  knowledge  of  a  few,  or  to  estimate  the  reliability 
of  one  form,  without  specifying  which  one,  from  a  knowledge  of 
several.    From  (23), 

$$R_n = \frac{n\,p_{1\mathrm{I}}}{(\sigma_1^2+\sigma_{\mathrm{I}}^2)/2 + (n-1)\,p_{1\mathrm{I}}}. \tag{27}$$

Reliability  of  the  sum  of  n  comparable  tests  estimated  from 
a  knowledge  of  two. 

Assuming that $\sigma_1^2 = \sigma_{\mathrm{I}}^2$, and dividing numerator and denominator
by this value,

$$R_n = \frac{n\,r_{1\mathrm{I}}}{1 + (n-1)\,r_{1\mathrm{I}}}. \tag{28}$$

This  is  the  Spearman-Brown  formula  used  as  an  instrument  of 
prediction.  The  only  difference  between  (28)  and  (24)  is  that  the 
average  intercorrelation  in  the  latter  is  replaced  by  the  single 
known  correlation.  Since  predictions  of  this  sort  are  possible  only 
with  comparable  tests,  (27)  is  only  slightly  superior  to  (28). 
Neither has any very definite meaning if $\sigma_1^2$ differs very markedly
from $\sigma_{\mathrm{I}}^2$.
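
A minimal Python sketch of formulas (27) and (28) is given here for illustration. The function names and the numerical values are hypothetical; the first argument is the number n of comparable forms to be summed.

```python
def reliability_of_sum_27(n, p, var_1, var_I):
    """Formula (27): uses the covariance p and the two variances."""
    return n * p / ((var_1 + var_I) / 2 + (n - 1) * p)


def spearman_brown_28(n, r):
    """Formula (28): uses only the correlation r between the two known forms."""
    return n * r / (1 + (n - 1) * r)


# Equal variances: the two formulas agree.
equal = (reliability_of_sum_27(2, 0.6, 1.0, 1.0), spearman_brown_28(2, 0.6))

# Variances 4 and 9 with the same correlation 0.6 (covariance 3.6):
# (27) now comes out somewhat lower than (28).
unequal = reliability_of_sum_27(2, 3.6, 4.0, 9.0)
print(equal, unequal)
```

The second case shows how the two formulas diverge once the variances differ, which is the situation in which the text warns that neither has a very definite meaning.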

Generalizing  (27), 

$$R_n = \frac{2n\,\mathrm{S}\,p_{ij}}{(m-1)\,\Sigma\sigma_i^2 + 2(n-1)\,\mathrm{S}\,p_{ij}}. \tag{29}$$

Reliability  of  the  sum  of  n  comparable  tests,  estimated  from 
a  knowledge  of  m  of  them. 

The symbol $\Sigma$ here represents a summation from 1 to m, and S a
summation from 1 to m(m - 1)/2.  Assuming all variances and
covariances  equal,  and  dividing  numerator  and  denominator  by 
the  variance, 

$$R_n = \frac{n\,r}{1 + (n-1)\,r}. \tag{30}$$

This  is  the  Spearman-Brown  formula  again.  The  value  of  r  is  to 
be  taken  here  as  the  average  intercorrelation  among  the  m  known 
forms  of  the  test.  Otherwise  (30)  is  identical  with  (24)  and  (28). 
If we already know the value of $R_m$, we may write,

$$R_n = \frac{nR_m/m}{1 + ((n/m) - 1)R_m},$$

$$R_n = \frac{nR_m}{m + (n-m)R_m}. \tag{31}$$

Reliability  of  the  sum  of  n  comparable  tests,  estimated  from 
a  knowledge  of  the  reliability  of  the  sum  of  m  of  them. 

This  result  was  first  published  by  Spearman  (1910).  It  is  the 
most  general  form  of  what  is  usually  called  the  Spearman-Brown 
formula,  although  in  this  form  it  was  published  only  by  Spearman. 
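
Formula (31) may be sketched in Python as below. The function name and the figures are hypothetical. The example assumes a single-form reliability of 0.6 and checks that stepping up from one form to four gives the same result as stepping up from the sum of two forms to four.

```python
def reliability_n_from_m(n, m, R_m):
    """Formula (31): reliability of the sum of n forms from that of m forms."""
    return n * R_m / (m + (n - m) * R_m)


R_1 = 0.6                                    # assumed reliability of one form
R_2 = reliability_n_from_m(2, 1, R_1)        # sum of two forms
R_4_direct = reliability_n_from_m(4, 1, R_1)
R_4_via_2 = reliability_n_from_m(4, 2, R_2)
R_1_back = reliability_n_from_m(1, 2, R_2)   # stepping down again
print(R_2, R_4_direct, R_4_via_2, R_1_back)
```

Setting m = 1 reduces (31) to the ordinary Spearman-Brown form, and setting n smaller than m steps the reliability down instead of up.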

If n = 1 in (29), we obtain,

$$R_1 = \frac{2\,\mathrm{S}\,p_{ij}}{(m-1)\,\Sigma\sigma_i^2}. \tag{32}$$

Reliability  of  one  unspecified  form  of  a  test,  estimated  from 
a  knowledge  of  m  comparable  forms. 

Setting  m  =  2  in  (32), 

$$R_1 = \frac{2\,p_{1\mathrm{I}}}{\sigma_1^2+\sigma_{\mathrm{I}}^2}. \tag{33}$$

Reliability  of  one  unspecified  form  of  a  test,  estimated  from 
a  knowledge  of  two  comparable  forms. 

Formula  (33)  should  give  a  slightly  better  estimate  than  (10),  the 
ordinary reliability coefficient.  It differs from $r_{1\mathrm{I}}$ only in that its
denominator  is  the  arithmetic  mean  of  the  two  variances  instead 
of  their  geometric  mean.  Since  with  comparable  tests  the  vari- 
ances are  approximately  equal,  this  difference  will  be  negligible  in 
most  practical  situations. 

Reliabilities  of  strictly  comparable  tests.  If  we  have  sev- 
eral tests  which  are  closely  comparable,  and  whose  means  in  addi- 
tion are  approximately  equal,  we  shall  call  these  strictly  com- 
parable tests,  to  distinguish  them  from  other  comparable  tests 
whose  means  are  not  necessarily  equal.  With  such  tests  we  may 
obtain  somewhat  better  estimates  of  the  reliabilities  of  unspeci- 
fied single  forms  and  of  sums  of  several  forms  than  are  given  by 
formulas  (10),  (15),  (24),  (27),  (28),  (29),  (30),  (31),  (32),  and 
(33).  In  this  case  we  may  assume  that  it  is  immaterial  which 
questions  occur  in  which  forms  of  the  test,  and  in  what  order  any 
individual  takes  the  several  forms.  If  we  have  N  individuals 
and  two  forms  of  the  test  given  to  each  of  them,  we  may  then  cal- 
culate a  single  mean  and  a  single  variance  for  the  2N  measures, 
according  to  the  formulas, 

$$M = (\Sigma X_1 + \Sigma X_{\mathrm{I}})/2N. \tag{34}$$

$$\sigma^2 = (\Sigma X_1^2 + \Sigma X_{\mathrm{I}}^2)/2N - M^2. \tag{35}$$

We may then calculate the covariance by the formula,

$$p = \Sigma X_1 X_{\mathrm{I}}/N - M^2. \tag{36}$$

Substituting in either (33) or the usual product-moment formula,

$$R_1 = r_1 = p/\sigma^2. \tag{37}$$

Reliability  of  a  single  form  of  a  test,  estimated  from  two 
strictly  comparable  forms,  the  particular  form  being  un- 
specified. 
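
The computation of (34) to (37) may be sketched in Python as follows. The function name and the scores are hypothetical; the two lists are assumed to hold the scores of the same N individuals on the two strictly comparable forms.

```python
def intraclass_two_forms(x_1, x_I):
    """Intraclass correlation from two strictly comparable forms, (34)-(37)."""
    N = len(x_1)
    M = (sum(x_1) + sum(x_I)) / (2 * N)                               # (34)
    var = (sum(a * a for a in x_1) + sum(b * b for b in x_I)) / (2 * N) - M * M  # (35)
    p = sum(a * b for a, b in zip(x_1, x_I)) / N - M * M              # (36)
    return p / var                                                    # (37)


# Hypothetical scores of four persons on the two forms.
form_1 = [1, 2, 3, 4]
form_I = [2, 1, 4, 3]
r_1 = intraclass_two_forms(form_1, form_I)
r_1_swapped = intraclass_two_forms(form_I, form_1)
print(r_1, r_1_swapped)
```

Because a single mean and a single variance are used for all 2N measures, the result does not depend on which form is called the first.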

The symbol $r_1$ (with a single subscript) will be used to designate
a  coefficient  obtained  from  (34),  (35),  (36),  and  (37).  This  coeffi- 
cient is  the  intraclass  correlation.  It  may  be  substituted  for  the 
ordinary  product-moment  intercorrelation  in  (15)  and  (28),  the 
two-variable  and  n-variable  cases  of  the  Spearman-Brown  formula 
used  to  predict  the  reliability  of  the  sum  of  several  comparable 
tests  from  a  knowledge  of  two,  if  these  two  are  strictly  comparable. 
It  has  a  slightly  smaller  sampling  error  than  the  corresponding 
intercorrelation,  and  is  therefore  to  be  preferred  whenever  the 
several  forms  of  the  test  are  strictly  comparable.  If  n  such  forms 
have  been  given  to  each  of  the  N  individuals,  we  may  obtain  in  a 
manner  similar  to  (34),  (35),  and  (36), 

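$$M = (\Sigma X_1 + \Sigma X_2 + \dots + \Sigma X_n)/nN. \tag{38}$$

$$\sigma^2 = (\Sigma X_1^2 + \Sigma X_2^2 + \dots + \Sigma X_n^2)/nN - M^2. \tag{39}$$

$$p = \frac{2(\Sigma X_1X_2 + \Sigma X_1X_3 + \dots + \Sigma X_1X_n + \dots + \Sigma X_{n-1}X_n)}{nN(n-1)} - M^2. \tag{40}$$

We have then,

$$R_1 = r_n = p/\sigma^2. \tag{41}$$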

Reliability  of  one  unspecified  form  of  a  test,  estimated  from 
a  knowledge  of  n  strictly  comparable  forms. 
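
A direct Python sketch of (38) to (41) follows, for illustration only. The function name and the scores are hypothetical, and the code simply sums every pairwise product as (40) directs, which is the laborious route and not the method of Appendix I.

```python
from itertools import combinations


def intraclass_n_forms(forms):
    """Generalized intraclass correlation, formulas (38)-(41).

    `forms` is a list of n score lists, each holding the same N persons.
    """
    n, N = len(forms), len(forms[0])
    M = sum(sum(f) for f in forms) / (n * N)                      # (38)
    var = sum(x * x for f in forms for x in f) / (n * N) - M * M  # (39)
    cross = sum(
        sum(a * b for a, b in zip(f, g)) for f, g in combinations(forms, 2)
    )
    p = 2 * cross / (n * N * (n - 1)) - M * M                     # (40)
    return p / var                                                # (41)


# Hypothetical scores of four persons.
r_two = intraclass_n_forms([[1, 2, 3, 4], [2, 1, 4, 3]])
r_three_identical = intraclass_n_forms([[1, 2, 3, 4]] * 3)
print(r_two, r_three_identical)
```

With n = 2 the function reproduces the two-form coefficient of (37).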

The quantity $r_n$ is the generalized intraclass correlation.  It may
be  substituted  for  the  average  intercorrelation  in  formulas  (24), 
(30),  and  (31),  the  variations  of  the  Spearman-Brown  formula 
for  cases  in  which  the  scores  on  more  than  two  forms  of  the  test 
are known.  Its sampling error is smaller than that of the corresponding
intercorrelation.  If n becomes at all large, the computation of p from
(40) becomes exceedingly laborious.  Furthermore, it has been shown by
Fisher (1928), Ch. 7, that (41) gives a slightly biassed estimate of the
true value of $r_n$ unless N is quite large as compared with n.  Therefore
$r_n$ should in practice be computed by another method.  This computation
is discussed further in Appendix I.




CHAPTER  III 
Validity 

General  meaning  of  validity.     The  validity  of  a  test  is, 
broadly  speaking,  the  efficiency  with  which  it  measures  the  trait 
it  was  designed  to  measure.    This  involves  two  errors,  as  noted 
previously,  in  addition  to  the  test  error  and  the  response  error. 
First,  the  test  may  contain  a  specific  non-chance  factor  not  found 
in  the  trait,  and  second,  the  trait  may  contain  a  specific  non- 
chance  factor  not  found  in  the  test,  i.e.,  the  test  may  sample 
either  more  or  less  than  the  trait,  or  both.    But  in  estimating  the 
validity  of  a  test  we  must  also  consider  the  test  error  and  the 
response  error,  both  in  the  test  and  in  the  criterion.    A  test  is 
fundamentally   valid   if   there  is  neither  sort  of  specific   non- 
chance  factor  present,  but  its   practical  validity   depends  also 
on  its  reliability.    Suppose,  for  example,  that  we  desire  to  pre- 
dict success  in  college.    This  might  be  defined  by  the  faculty  as 
the  point-hour  ratio,  which  is  obtained  by  multiplying  the  num- 
ber of  hours  of  A  achieved  by  the  student  by  4,  of  B  by  3,  of  C 
by  2,  of  D  by  1,  and  of  F  by  0;  and  dividing  the  sum  of  these 
points  by  the  number  of  hours  per  week  carried.    The  trait  which 
the  test  is  designed  to  measure  is  then  the  sum-total  of  all  syste- 
matic factors  which  influence  the  point-hour  ratio.     The  latter 
may  of  course  be  influenced,  and  quite  largely,  by  chance  factors, 
which  from  their  very  nature  are  unpredictable.    The  reliability 
of  the  criterion  may  be  obtained  from  the  point-hour  ratios  of 
successive  semesters  or  quarters  by  the  methods  outlined  in  the 
previous  chapter.     The  fact  that  some  applicants  are  refused 
admittance  and  that  others  drop  out  along  the  way  introduces 
complications  in  the  practical  situation  that  may  for  present  pur- 
poses be  neglected. 

Fundamental  factor  pattern  for  validity. 

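Let $X_1$ and $X_{\mathrm{I}}$ be scores on two forms of a test,
$X_2$ and $X_{\mathrm{II}}$, equivalent criterion measurements,
$x_\infty$, the underlying ability measured by the test,
$x_\omega$, the ability underlying the criterion measurements,
which is the ability we are trying to measure.  Then,

$$\begin{aligned}
X_1 &= c_1 x_\infty + \delta_1,\\
X_{\mathrm{I}} &= c_{\mathrm{I}} x_\infty + \delta_{\mathrm{I}},\\
X_2 &= c_2 x_\omega + \delta_2,\\
X_{\mathrm{II}} &= c_{\mathrm{II}} x_\omega + \delta_{\mathrm{II}}.
\end{aligned}$$

If $x_\infty$ is correlated with $x_\omega$, we may consider this correlation to be
the result of a factor common to the corresponding measurements.

Let a be a factor common to all four measurements,

b, a factor common to $X_1$ and $X_{\mathrm{I}}$ but outside $X_2$ and $X_{\mathrm{II}}$,
d, a factor common to $X_2$ and $X_{\mathrm{II}}$ but outside $X_1$ and $X_{\mathrm{I}}$.

We then have the following factor pattern,

$$\begin{aligned}
X_1 &= c_1 a + c_1 b + \delta_1,\\
X_{\mathrm{I}} &= c_{\mathrm{I}} a + c_{\mathrm{I}} b + \delta_{\mathrm{I}},\\
X_2 &= c_2 a + c_2 d + \delta_2 = c_2 x_\omega + \delta_2,\\
X_{\mathrm{II}} &= c_{\mathrm{II}} a + c_{\mathrm{II}} d + \delta_{\mathrm{II}} = c_{\mathrm{II}} x_\omega + \delta_{\mathrm{II}}.
\end{aligned}$$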

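The value $x_\omega$ is, as stated above, the sum of all the abilities which
cause systematic variation in the criterion scores.  Since we have
assumed that the two forms of the test measure the same fundamental
abilities, the a and b factors in $X_1$ and $X_{\mathrm{I}}$ may be multiplied
by the same c's; and similarly for the a and d factors in $X_2$
and $X_{\mathrm{II}}$.  It is assumed that $X_1$ and $X_{\mathrm{I}}$ contain the same relative
proportions of a and b; an assumption implied in the previous
one, that the tests and criterion measures sample the same respective
fundamental abilities, i.e., that $x_\infty$ is the same in the two tests
and $x_\omega$ is the same in the two criterion measurements.  Making
the additional assumptions implied in the factor pattern, namely
that a, b, d, and all the $\delta$'s are uncorrelated with one another, we
have in succession,

$$\begin{aligned}
p_{1\mathrm{I}} &= c_1 c_{\mathrm{I}}\sigma_a^2 + c_1 c_{\mathrm{I}}\sigma_b^2,\\
p_{12} &= c_1 c_2 \sigma_a^2,\\
p_{1.\mathrm{II}} &= c_1 c_{\mathrm{II}} \sigma_a^2,\\
p_{\mathrm{I}2} &= c_{\mathrm{I}} c_2 \sigma_a^2,\\
p_{\mathrm{I}.\mathrm{II}} &= c_{\mathrm{I}} c_{\mathrm{II}} \sigma_a^2,\\
p_{2.\mathrm{II}} &= c_2 c_{\mathrm{II}}\sigma_a^2 + c_2 c_{\mathrm{II}}\sigma_d^2 = c_2 c_{\mathrm{II}}\sigma_\omega^2.
\end{aligned} \tag{1}$$

Furthermore,

$$\begin{aligned}
p_{12} &= \Sigma(X_1(c_2 a + c_2 d + \delta_2))/N = c_2\,p_{1\omega},\\
p_{1.\mathrm{II}} &= \Sigma(X_1(c_{\mathrm{II}} a + c_{\mathrm{II}} d + \delta_{\mathrm{II}}))/N = c_{\mathrm{II}}\,p_{1\omega},\\
p_{\mathrm{I}2} &= \Sigma(X_{\mathrm{I}}(c_2 a + c_2 d + \delta_2))/N = c_2\,p_{\mathrm{I}\omega},\\
p_{\mathrm{I}.\mathrm{II}} &= \Sigma(X_{\mathrm{I}}(c_{\mathrm{II}} a + c_{\mathrm{II}} d + \delta_{\mathrm{II}}))/N = c_{\mathrm{II}}\,p_{\mathrm{I}\omega}.
\end{aligned} \tag{2}$$

These equations hold only under the above assumption that a, b,
d, and all the $\delta$'s are uncorrelated with one another.  It is very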
important  that  the  two  forms  of  the  test  be  given  at  such  an  inter- 
val that  there  will  not  be  any  recurrent  response  errors,  which 
would  introduce  a  spurious  b-factor.  The  two  criterion  measures 
must  likewise  be  free  from  any  spurious  d-factors.  In  the  case 
of  point-hour  ratios,  for  example,  we  should  not  take  the  sum  of 
the  first-semester  ratios  as  one  form  and  the  sum  of  the  second- 
semester  ratios  as  the  other,  since  many  courses  run  for  a  full  year, 
and  the  "halo  effect"  in  the  second-semester  marks  in  such  cases 
would  enter  as  a  spurious  d-factor.  A  better  method  would  be 
to  take  the  sum  of  the  first,  fourth,  fifth,  and  eighth-semester 
ratios  as  one  form,  and  the  sum  of  the  second,  third,  sixth,  and 
seventh-semester  ratios  as  the  other. 

Statistical estimation of fundamental validity.  If the
fundamental validity of the test is perfect, there will be no b-factor
and no d-factor, and we obtain at once from equations (1),

$$p_{1\mathrm{I}}\,p_{2.\mathrm{II}} = p_{12}\,p_{\mathrm{I}.\mathrm{II}} = p_{1.\mathrm{II}}\,p_{\mathrm{I}2}. \tag{3}$$

Criterion  of  perfect  fundamental  validity. 

This  is  the  well-known  tetrad  relation,  expressed  in  terms  of  the 
covariances.  If  the  fundamental  validity  is  zero,  there  will  be  no 
a-factor,  and  the  lower  the  fundamental  validity,  the  smaller  will 
be  the  variance  of  the  a-factor  in  comparison  with  those  of  the 
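b-factor and the d-factor.  All the p's except $p_{1\mathrm{I}}$ and $p_{2.\mathrm{II}}$ must,
therefore, become smaller, relatively to these two, as $\sigma_a^2$ becomes
smaller relatively to $\sigma_b^2$ and $\sigma_d^2$.  Hence, as the coefficient of
fundamental validity we may write,

$$V_\infty = \left(\frac{p_{12}\,p_{\mathrm{I}.\mathrm{II}}}{p_{1\mathrm{I}}\,p_{2.\mathrm{II}}}\right)^{1/2} = \left(\frac{p_{1.\mathrm{II}}\,p_{\mathrm{I}2}}{p_{1\mathrm{I}}\,p_{2.\mathrm{II}}}\right)^{1/2}.$$

Combining the two right-hand expressions,

$$V_\infty = \frac{(p_{12}\,p_{1.\mathrm{II}}\,p_{\mathrm{I}2}\,p_{\mathrm{I}.\mathrm{II}})^{1/4}}{(p_{1\mathrm{I}}\,p_{2.\mathrm{II}})^{1/2}}. \tag{4}$$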

Coefficient  of  fundamental  validity. 

This  value  is  a  special  case  of  the  correlation  corrected  for  atten- 
uation. The  roots  are  taken  in  order  that  the  coefficient  may 
vary  with  the  variances  themselves,  rather  than  with  their 
squares.  The  two  right-hand  expressions  in  the  equation  imme- 
diately preceding  (4)  should  give  the  same  value,  within  their 
sampling errors, according to our analysis.  This gives a partial
check on the assumption that the two forms of the test and the
two criterion measures sample respectively the same abilities.
This essential equality may be more simply expressed,

$$\frac{p_{12}\,p_{\mathrm{I}.\mathrm{II}}}{p_{1.\mathrm{II}}\,p_{\mathrm{I}2}} = 1. \tag{5}$$

Partial  check  for  equivalence  of  tests  and  criterion  measures. 
This  is  a  special  case  of  the  tetrad  ratio. 

The  coefficient  of  fundamental  validity  will  vary  from  zero  for 
no  fundamental  validity  to  unity  for  perfect  fundamental  valid- 
ity. The  correlation  coefficients  may  be  used  instead  of  the 
covariances  in  (4)  and  (5)  without  changing  either  its  value  or 
the  value  of  its  standard  error. 
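
Formulas (4) and (5) may be sketched in Python as follows, using correlations in place of covariances as the preceding paragraph permits. The function names, argument names, and numerical values are hypothetical.

```python
def fundamental_validity(r_12, r_1II, r_I2, r_III, r_1I, r_2II):
    """Formula (4): coefficient of fundamental validity."""
    return (r_12 * r_1II * r_I2 * r_III) ** 0.25 / (r_1I * r_2II) ** 0.5


def tetrad_ratio(r_12, r_1II, r_I2, r_III):
    """Formula (5): should be near 1 if the forms are equivalent."""
    return (r_12 * r_III) / (r_1II * r_I2)


# Hypothetical case: every test-criterion correlation is 0.3, and the
# two reliability correlations (test and criterion) are both 0.6.
v = fundamental_validity(0.3, 0.3, 0.3, 0.3, 0.6, 0.6)
t = tetrad_ratio(0.3, 0.3, 0.3, 0.3)
print(v, t)
```

In this hypothetical case the coefficient is one half, and the tetrad ratio is exactly unity because the four cross-correlations are equal.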

Practical  validity.  Even  if  the  fundamental  validity  of  a  test 
is  perfect,  its  practical  validity  may  still  be  very  low,  due  to  its 
lack  of  reliability.  Practical  validity  may  be  defined  in  general 
terms  as  the  accuracy  with  which  a  test  measures  the  ability 
underlying  a  specified  criterion.  It  should,  therefore,  vary  with 
both  the  fundamental  validity  and  the  reliability  of  the  test,  but 
not  with  the  reliability  of  the  criterion  measures.  We  must 
assume  that  the  latter  are  fundamentally  valid,  however,  i.e., 
that  all  systematic  causes  of  variation  in  them  (a-factors  and 
d-factors) shall be included in $x_\omega$.  The practical validity is to be
distinguished  from  the  predictive  value  of  the  test,  which  depends 
on  the  reliability  of  the  criterion  measures  as  well  as  on  the  funda- 
mental validity  and  reliability  of  the  test. 

Estimation  of  practical  validity.  It  has  been  shown  previ- 
ously that  the  reliability  of  a  test  is  equal  to  the  square  of  its  cor- 
relation with  the  fundamental  ability  underlying  it.  From  the 
general  definition  of  practical  validity,  and  by  analogy  with  the 
reliability  coefficient,  we  may  define  the  coefficient  of  practical 
validity  as  the  square  of  the  correlation  between  the  test  score 
and  the  fundamental  ability  underlying  the  criterion  scores. 
Then  from  the  factor  pattern, 

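$$r_{1\omega}^2 = r_{1(a+d)}^2 = \frac{p_{1\omega}^2}{\sigma_1^2\,\sigma_\omega^2}.$$

From (2),

$$p_{12}\,p_{1.\mathrm{II}} = c_2 c_{\mathrm{II}}\,p_{1\omega}^2,$$

and from (1),

$$p_{2.\mathrm{II}} = c_2 c_{\mathrm{II}}\,\sigma_\omega^2,$$

so that,

$$V_1 = r_{1\omega}^2 = \frac{p_{12}\,p_{1.\mathrm{II}}}{p_{2.\mathrm{II}}\,\sigma_1^2} = \frac{r_{12}\,r_{1.\mathrm{II}}}{r_{2.\mathrm{II}}}. \tag{6}$$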

Coefficient  of  practical  validity  of  Form  A. 

It  may  be  seen  that  to  determine  the  practical  validity  of  a 
test  we  require  only  the  one  form,  but  we  also  require  two  criterion 
estimates.  The  latter  need  not  be  comparable,  but  their  errors  of 
measurement  must  be  uncorrelated.  The  coefficient  of  practical 
validity  resembles  the  square  of  Spearman's  correlation  between 
a test and the general factor (1927), Appendix p. xvi.  But $x_\omega$ in
our  case  contains  not  only  the  general  factor  (here  a),  but  also 
the  group  factor  in  the  two  criterion  measures  (here  d).  This 
shows  clearly  the  fallacy  involved  in  interpreting  the  function  in 
this  fashion  in  systems  in  which  the  theory  of  two  factors  does 
not  hold. 

If  there  is  no  d-factor,  the  coefficient  of  practical  validity  may 
be  shown  to  be  equal  to  the  ratio  of  the  variance  of  the  common 
factor to that of the total test scores.  Setting d = 0 in (1), we find,

$$p_{12}\,p_{1.\mathrm{II}}/p_{2.\mathrm{II}} = c_1^2\sigma_a^2,$$

and from (6),

$$V_1 = c_1^2\sigma_a^2/\sigma_1^2. \tag{7}$$

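In the formulas for practical validity, no use has been made of
$X_{\mathrm{I}}$, these formulas being concerned only with a single test and the
two criterion measures.  Hence we have at once,

$$V_{\mathrm{I}} = \frac{p_{\mathrm{I}2}\,p_{\mathrm{I}.\mathrm{II}}}{p_{2.\mathrm{II}}\,\sigma_{\mathrm{I}}^2} = \frac{r_{\mathrm{I}2}\,r_{\mathrm{I}.\mathrm{II}}}{r_{2.\mathrm{II}}}. \tag{8}$$

Practical validity of Form B.

$$V_{(1+\mathrm{I})} = \frac{p_{(1+\mathrm{I})2}\,p_{(1+\mathrm{I})\mathrm{II}}}{p_{2.\mathrm{II}}\,\sigma_{(1+\mathrm{I})}^2} = \frac{r_{(1+\mathrm{I})2}\,r_{(1+\mathrm{I})\mathrm{II}}}{r_{2.\mathrm{II}}}. \tag{9}$$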

Practical  validity  of  Form  A  plus  Form  B  in  terms  of  com- 
bined scores. 

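Alternatively,

$$r_{(1+\mathrm{I})\omega}^2 = \frac{\left(\Sigma x_\omega(X_1+X_{\mathrm{I}})/N\right)^2}{\sigma_\omega^2\,\sigma_{(1+\mathrm{I})}^2} = \frac{(p_{1\omega}+p_{\mathrm{I}\omega})^2}{\sigma_\omega^2\,\sigma_{(1+\mathrm{I})}^2}.$$

From (2),

$$\begin{aligned}
p_{12}\,p_{1.\mathrm{II}} &= c_2 c_{\mathrm{II}}\,p_{1\omega}^2,\\
p_{\mathrm{I}2}\,p_{\mathrm{I}.\mathrm{II}} &= c_2 c_{\mathrm{II}}\,p_{\mathrm{I}\omega}^2,\\
p_{12}\,p_{\mathrm{I}.\mathrm{II}} = p_{1.\mathrm{II}}\,p_{\mathrm{I}2} &= c_2 c_{\mathrm{II}}\,p_{1\omega}\,p_{\mathrm{I}\omega},
\end{aligned}$$

so that,

$$(p_{1\omega}+p_{\mathrm{I}\omega})^2 = (p_{12}\,p_{1.\mathrm{II}} + p_{\mathrm{I}2}\,p_{\mathrm{I}.\mathrm{II}} + p_{12}\,p_{\mathrm{I}.\mathrm{II}} + p_{1.\mathrm{II}}\,p_{\mathrm{I}2})/c_2 c_{\mathrm{II}}.$$

Also from (1),

$$p_{2.\mathrm{II}} = c_2 c_{\mathrm{II}}\,\sigma_\omega^2,$$

so that,

$$V_{(1+\mathrm{I})} = r_{(1+\mathrm{I})\omega}^2 = \frac{p_{12}\,p_{1.\mathrm{II}} + p_{\mathrm{I}2}\,p_{\mathrm{I}.\mathrm{II}} + p_{12}\,p_{\mathrm{I}.\mathrm{II}} + p_{1.\mathrm{II}}\,p_{\mathrm{I}2}}{p_{2.\mathrm{II}}\,(\sigma_1^2 + \sigma_{\mathrm{I}}^2 + 2p_{1\mathrm{I}})}. \tag{10}$$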

Practical  validity  of  Form  A  plus  Form  B  in  terms  of  the 
separate  scores. 
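
The practical validity formulas (6) and (10) may be sketched in Python as below. The function names and the covariances are hypothetical. The example assumes a factor pattern in which every c is 1, the a-factor has variance 1, there is no b-factor or d-factor, and every error variance is 1, so that each score has variance 2 and every covariance is 1.

```python
def practical_validity_6(p_12, p_1II, p_2II, var_1):
    """Formula (6): practical validity of a single form."""
    return p_12 * p_1II / (p_2II * var_1)


def practical_validity_10(p_12, p_1II, p_I2, p_III, p_2II, var_1, var_I, p_1I):
    """Formula (10): practical validity of the sum of the two forms."""
    numerator = p_12 * p_1II + p_I2 * p_III + p_12 * p_III + p_1II * p_I2
    return numerator / (p_2II * (var_1 + var_I + 2 * p_1I))


# Assumed values from the hypothetical factor pattern described above.
single = practical_validity_6(1.0, 1.0, 1.0, 2.0)
double = practical_validity_10(1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 1.0)
print(single, double)
```

Since the fundamental validity is perfect in this hypothetical case, the single-form value equals the reliability of one form (one half), and the value for the sum equals the Spearman-Brown reliability of the doubled test (two thirds).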

Significance  of  measures  of  validity.  The  practical  validity 
of  a  test  is  the  most  important  feature  of  its  real  usefulness.  Even 
though  the  fundamental  validity  may  not  be  perfect,  the  practical 
validity, due to the high reliability of the test, may well be greater
than  the  reliability  of  the  criterion.  If  the  fundamental  validity 
is  perfect,  the  practical  validity  should  equal  the  reliability  of  the 
test.  It  is  important  to  note  that  in  validating  a  test,  the  cri- 
terion need  not  be  very  reliable,  provided  it  is  fundamentally  valid. 
Thus  in  the  case  of  college  grades,  if  the  point-hour  ratio  is  taken 
by  fiat  as  the  definition  of  academic  success,  a  test  might  well 
be  constructed  whose  reliability  would  be  so  high  as  compared  to 
the reliability of teacher's marks, that it would be a better measure
of  the  systematic  mental  traits  underlying  academic  success  than 
the  total  academic  record — the  point-hour  ratio  for  the  entire 
period  of  attendance — and  this  in  spite  of  a  fundamental  validity 
definitely  (though  not  greatly)  less  than  perfect. 

If  the  object  of  giving  a  test  is  simply  to  predict  academic  suc- 
cess as  measured  by  the  point-hour  ratio,  say,  then  its  predictive 
value  will  be  given  by  the  raw  correlation  between  the  test  scores 
and  the  criterion  measures.  This  correlation  is  of  course  depend- 
ent upon  the  reliabilities  of  both  the  test  and  the  criterion.  Its 
value,  as  will  be  shown  in  the  next  chapter,  can  only  approach  the 
geometric  mean  of  the  two  reliability  coefficients  as  a  maximum. 


CHAPTER  IV 
Correction  for  Attenuation 

Fundamental  factor  pattern.  The  basic  problem  of  this 
chapter  is  the  estimation  of  the  correlation  between  the  true  abil- 
ities underlying  two  sets  of  test  scores.  If  we  have  two  forms  of 
the  test  measuring  each  trait,  we  may  assume  the  following  factor 
pattern  to  hold: 

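$$\begin{aligned}
X_1 &= c_1 x_\infty + \delta_1,\\
X_{\mathrm{I}} &= c_{\mathrm{I}} x_\infty + \delta_{\mathrm{I}},\\
X_2 &= c_2 x_\omega + \delta_2,\\
X_{\mathrm{II}} &= c_{\mathrm{II}} x_\omega + \delta_{\mathrm{II}}.
\end{aligned}$$

The problem is then to determine the value of $r_{\infty\omega}$.  If we assume
that of the values $x_\infty$, $x_\omega$, $\delta_1$, $\delta_{\mathrm{I}}$, $\delta_2$, and $\delta_{\mathrm{II}}$, $x_\infty$ and $x_\omega$ alone are
correlated, we obtain,

$$\begin{aligned}
p_{1\mathrm{I}} &= c_1 c_{\mathrm{I}}\,\sigma_\infty^2,\\
p_{12} &= c_1 c_2\,p_{\infty\omega},\\
p_{1.\mathrm{II}} &= c_1 c_{\mathrm{II}}\,p_{\infty\omega},\\
p_{\mathrm{I}2} &= c_{\mathrm{I}} c_2\,p_{\infty\omega},\\
p_{\mathrm{I}.\mathrm{II}} &= c_{\mathrm{I}} c_{\mathrm{II}}\,p_{\infty\omega},\\
p_{2.\mathrm{II}} &= c_2 c_{\mathrm{II}}\,\sigma_\omega^2.
\end{aligned} \tag{1}$$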

Experimental  implications.  By  giving  all  four  of  the  tests 
at  different  times,  the  response  errors  could  all,  theoretically,  be 
rendered  independent.  But  this  is  in  general  impractical.  If  the 
Form  A  tests  be  given  at  one  time  and  the  Form  B  tests  at  another, 
however, it will still be possible to get an unbiassed estimate of
$r_{\infty\omega}$.  In this case, the response errors in $X_1$ and $X_2$ will be
correlated, as will those in $X_{\mathrm{I}}$ and $X_{\mathrm{II}}$, but we may take $\delta_1$ and $\delta_{\mathrm{II}}$ to be
uncorrelated, and the same for $\delta_{\mathrm{I}}$ and $\delta_2$.

Statistical  estimation  of  the  correlation  between  under- 
lying abilities.  From  the  above  considerations  and  from  (1), 
we  find, 

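$$p_{1\mathrm{I}}\,p_{2.\mathrm{II}} = c_1 c_{\mathrm{I}} c_2 c_{\mathrm{II}}\,\sigma_\infty^2\sigma_\omega^2,$$

$$p_{1.\mathrm{II}}\,p_{\mathrm{I}2} = c_1 c_{\mathrm{I}} c_2 c_{\mathrm{II}}\,p_{\infty\omega}^2,$$

and,

$$r_{\infty\omega} = p_{\infty\omega}/\sigma_\infty\sigma_\omega = \left(\frac{p_{1.\mathrm{II}}\,p_{\mathrm{I}2}}{p_{1\mathrm{I}}\,p_{2.\mathrm{II}}}\right)^{1/2}. \tag{2}$$

Alternatively, dividing numerator and denominator by $\sigma_1\sigma_{\mathrm{I}}\sigma_2\sigma_{\mathrm{II}}$,

$$r_{\infty\omega} = \left(\frac{r_{1.\mathrm{II}}\,r_{\mathrm{I}2}}{r_{1\mathrm{I}}\,r_{2.\mathrm{II}}}\right)^{1/2}. \tag{3}$$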

Correlation  corrected  for  attenuation. 

This  formula  has  been  noted  by  Yule  (1927),  pp.  213-14,  as  being 
unaffected by correlated errors in $\delta_1$ and $\delta_2$ or $\delta_{\mathrm{I}}$ and $\delta_{\mathrm{II}}$.  His
derivation of it, however, was based on a factor pattern without the
c's,  assuming  both  measures  of  each  trait  to  be  in  the  same  units. 
The  above  demonstration  shows  that  the  formula  is  in  fact  inde- 
pendent of  this  assumption. 

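If we assume that all the errors are uncorrelated, as would be
reasonable if all the tests had been given at different times, we
obtain from (1),

$$p_{12}\,p_{\mathrm{I}.\mathrm{II}} = c_1 c_{\mathrm{I}} c_2 c_{\mathrm{II}}\,p_{\infty\omega}^2.$$

From this equation and those immediately preceding (2),

$$r_{\infty\omega} = \left(\frac{r_{12}\,r_{\mathrm{I}.\mathrm{II}}}{r_{1\mathrm{I}}\,r_{2.\mathrm{II}}}\right)^{1/2}. \tag{4}$$

Taking the geometric mean of (3) and (4),

$$r_{\infty\omega} = \frac{(r_{12}\,r_{1.\mathrm{II}}\,r_{\mathrm{I}2}\,r_{\mathrm{I}.\mathrm{II}})^{1/4}}{(r_{1\mathrm{I}}\,r_{2.\mathrm{II}})^{1/2}}. \tag{5}$$

Correlation corrected for attenuation, assuming all errors
to be uncorrelated.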
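
Formulas (3), (4), and (5) may be sketched in Python as follows. The function names and the correlations are hypothetical; the argument names follow the subscripts of the text, with r_1II standing for the correlation of form 1 of the first test with form II of the second, and so on.

```python
import math


def corrected_3(r_1II, r_I2, r_1I, r_2II):
    """Formula (3): uses only the correlations between tests given at different times."""
    return math.sqrt(r_1II * r_I2 / (r_1I * r_2II))


def corrected_4(r_12, r_III, r_1I, r_2II):
    """Formula (4): uses the correlations between tests given at the same time."""
    return math.sqrt(r_12 * r_III / (r_1I * r_2II))


def corrected_5(r_12, r_1II, r_I2, r_III, r_1I, r_2II):
    """Formula (5): geometric mean of (3) and (4)."""
    return (r_12 * r_1II * r_I2 * r_III) ** 0.25 / math.sqrt(r_1I * r_2II)


# Hypothetical correlations chosen so that (3) and (4) agree.
est_3 = corrected_3(0.30, 0.30, 0.6, 0.6)
est_4 = corrected_4(0.36, 0.25, 0.6, 0.6)
est_5 = corrected_5(0.36, 0.30, 0.30, 0.25, 0.6, 0.6)
print(est_3, est_4, est_5)
```

If the same-time correlations were inflated by correlated response errors, (4) and therefore (5) would be raised while (3) would be unaffected, which is the reason the text prefers (3) in that situation.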

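We might also combine (3) and (4) by taking their arithmetic
mean, in which case,

$$r_{\infty\omega} = \frac{(r_{12}\,r_{\mathrm{I}.\mathrm{II}} + r_{1.\mathrm{II}}\,r_{\mathrm{I}2})^{1/2}}{(2\,r_{1\mathrm{I}}\,r_{2.\mathrm{II}})^{1/2}}. \tag{6}$$

Alternate form of the correlation corrected for attenuation,
assuming all errors to be uncorrelated.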

Alternative  formulas  for  the  correlation  corrected  for 
attenuation.  If  we  define  the  reliability  of  a  test,  as  in  Chapter 
I,  by  the  relation, 

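$$R_1 = c_1^2\sigma_\infty^2/\sigma_1^2,$$

we find,

$$\sigma_\infty^2 = R_1\sigma_1^2/c_1^2.$$

Similarly,

$$\sigma_\omega^2 = R_2\sigma_2^2/c_2^2,$$

and from (1),

$$p_{\infty\omega} = p_{12}/c_1 c_2,$$

so that,

$$r_{\infty\omega} = \frac{p_{12}}{\sigma_1\sigma_2\,(R_1R_2)^{1/2}},$$

and,

$$r_{\infty\omega} = \frac{r_{12}}{(R_1R_2)^{1/2}}. \tag{7}$$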

This  formula  will  not  in  general  be  used  as  a  substitute  for  (2)  or 
(3)  in  practical  computations,  because  of  the  difficulties  that 
arise in evaluating $R_1$ and $R_2$.  From it, however, we may deduce
one important relationship.  If $r_{\infty\omega}$ is equal to unity,

$$r_{12} = (R_1R_2)^{1/2}. \tag{8}$$

Upper  limit  of  the  obtained  correlation  between  two  fallible 
tests. 

This  limit  is  not  absolute— it  may  be  exceeded  by  chance — but 
for  practical  purposes  it  gives,  as  noted  in  the  previous  chapter, 
the  upper  limit  of  the  ability  of  a  fallible  test  to  predict  the  score 
on  another  fallible  test  or  criterion  measure. 
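
Formulas (7) and (8) may be sketched in Python as below. The function names and the reliabilities are hypothetical.

```python
import math


def attenuation_ceiling(R_1, R_2):
    """Formula (8): upper limit of the obtained correlation between two fallible tests."""
    return math.sqrt(R_1 * R_2)


def disattenuate(r_12, R_1, R_2):
    """Formula (7): correlation between the underlying abilities."""
    return r_12 / math.sqrt(R_1 * R_2)


# Assumed reliabilities 0.81 and 0.64, assumed obtained correlation 0.36.
ceiling = attenuation_ceiling(0.81, 0.64)
corrected = disattenuate(0.36, 0.81, 0.64)
print(ceiling, corrected)
```

With these assumed reliabilities no obtained correlation can be expected to exceed 0.72 except by chance, and an obtained correlation of 0.36 corresponds to a correlation of one half between the underlying abilities.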

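If $c_1/\sigma_1 = c_{\mathrm{I}}/\sigma_{\mathrm{I}}$ and $c_2/\sigma_2 = c_{\mathrm{II}}/\sigma_{\mathrm{II}}$, we have from equation (10) of
Chapter II,

$$R_1 = R_{\mathrm{I}} = r_{1\mathrm{I}},$$

$$R_2 = R_{\mathrm{II}} = r_{2.\mathrm{II}},$$

and from (7),

$$r_{\infty\omega} = \frac{r_{12}}{(r_{1\mathrm{I}}\,r_{2.\mathrm{II}})^{1/2}}. \tag{9}$$

By a line of reasoning exactly like that leading to (7) and (9),
we obtain,

$$\begin{aligned}
r_{\infty\omega} &= r_{1.\mathrm{II}}/(r_{1\mathrm{I}}\,r_{2.\mathrm{II}})^{1/2},\\
r_{\infty\omega} &= r_{\mathrm{I}2}/(r_{1\mathrm{I}}\,r_{2.\mathrm{II}})^{1/2},\\
r_{\infty\omega} &= r_{\mathrm{I}.\mathrm{II}}/(r_{1\mathrm{I}}\,r_{2.\mathrm{II}})^{1/2},
\end{aligned}$$

and averaging these equations and (9),

$$r_{\infty\omega} = \frac{r_{12} + r_{1.\mathrm{II}} + r_{\mathrm{I}2} + r_{\mathrm{I}.\mathrm{II}}}{4\,(r_{1\mathrm{I}}\,r_{2.\mathrm{II}})^{1/2}}. \tag{10}$$

Correlation corrected for attenuation, when all the errors of
measurement are uncorrelated and the two forms of each
test are equally reliable.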

Formulas  (9)  and  (10)  were  given  by  Spearman  in  his  original 
paper (1904 proof).  He made the statement there,

"Should circumstances happen to render, say $X_1$, much more
accurate than $X_{\mathrm{I}}$, then the correlations involving $X_1$ will be
considerably greater than those involving $X_{\mathrm{I}}$.  In such case, the
numerator of the above fraction must be formed by the geometrical
mean instead of by the arithmetical mean; hereby the accidental
errors of the respective observations cease to eliminate one
another  and  therefore  double  their  final  influence;  they  also 
introduce  an  undue  diminution  of  the  fraction."* 

Yule's  proof  of  (5),  cited  by  Brown  (1909)  and  (1910  exp.),  and 
Spearman (1910) was the first to show clearly the assumption
that  the  errors  of  measurement  are  uncorrelated.  It  was  based, 
as  was  his  derivation  of  (3),  on  a  factor  pattern  without  the  c's, 
assuming  these  to  be  equal  for  each  test.  With  such  a  factor  pat- 
tern, (10)  can  only  be  derived  on  the  assumption  of  comparable 
tests  of  both  abilities.  The  present  derivation  shows  that  the 
assumption  of  equal  units  is  not  necessary  for  the  applicability  of 
(3)  and  (5),  and  that  the  assumption  of  comparable  tests  is  not 
basic  to  (10). 

In  practice,  (3)  and  (5)  will  be  the  most  useful  for  estimating 
true  correlations,  as  they  rest  on  only  the  single  assumption  that 
certain  (or  all)  of  the  errors  of  measurement  are  uncorrelated  with 
one  another  and  with  the  true  abilities. 

Checks  for  the  correctness  of  assumptions.  There  is  no 
adequate  check  for  the  assumption  that  errors  of  measurement 
are  uncorrelated.  Brown  (1909)  and  (1910  obj.),  has  proposed 
two.    The  first  of  these  may  be  written  in  the  present  notation, 

$$p_{1,ii} - p_{i2} = 0 \quad \text{(within the sampling error)}. \tag{11}$$

Now  assuming  that  the  errors  of  measurement  are  uncorrelated, 
we  have  from  (1), 

$$p_{1,ii} = c_1 c_{ii}\, p_{\infty\omega},$$
$$p_{i2} = c_i c_2\, p_{\infty\omega},$$

whence we see that (11) holds only under the additional assumption
that $c_1 c_{ii} = c_i c_2$, which was found not to be necessary in the
derivation of (3) and (5). Brown's second check may be written,

$$r_{(x_1 - x_i)(x_2 - x_{ii})} = 0 \quad \text{(within the sampling error)}. \tag{12}$$


*  In  the  above  quotation,  the  notation  here  used  has  been  substituted  for 
that  employed  by  Spearman.  In  this  paper  of  1904,  no  proof  was  given,  but 
three  years  later  (1907)  another  paper  appeared,  giving  a  lengthy  and  some- 
what obscure  proof  of  formula  (5).  In  this  paper  Spearman  noted  that  if  the 
two  sets  of  measurements  in  each  case  were  equally  reliable,  formula  (10)  was 
to be preferred.

Expanding this we have,

$$\frac{p_{12} + p_{i,ii} - p_{1,ii} - p_{i2}}{\sigma_{(x_1 - x_i)}\,\sigma_{(x_2 - x_{ii})}} = 0.$$

This expression will vanish with its numerator. Making the original
assumption of zero correlation between errors of measurement
again, we note from (1) that (12) will hold only if
$c_1 c_2 + c_i c_{ii} = c_1 c_{ii} + c_i c_2$, a condition which again is not necessary to
the derivation of (3) and (5). Neither of these checks is in general,
therefore, a necessary condition for the applicability of either (3)
or (5). But if the two forms of each test measure in the same
units, both of them are applicable, and in fact,

$$p_{12} = p_{1,ii} = p_{i2} = p_{i,ii} \quad \text{(within the sampling errors)}. \tag{13}$$

General  check  for  the  assumption  that  errors  of  measure- 
ment are  uncorrelated,  applicable  when  the  two  forms  of 
each  test  are  known  to  measure  in  the  same  units. 

This  check  contains  both  of  Brown's  as  special  cases,  and  is 
founded  on  the  same  basic  assumption  of  equality  of  units  of 
measurement  in  the  two  forms  of  each  test,  that  underlies  Yule's 
simplified  factor  pattern,  which  Brown  employed  in  deriving  his 
checks.    If  (13)  holds,  (11)  and  (12)  of  necessity  hold  also. 

As  a  check  on  the  assumption  that  there  are  no  correlated 
response  errors,  which  is  the  essential  condition  for  the  applica- 
bility of  (5),  we  have  from  (3)  and  (4), 

$$\frac{r_{12}\, r_{i,ii}}{r_{1,ii}\, r_{i2}} = 1. \tag{14}$$

Check  for  the  assumption  that  response  errors  are  uncorre- 
lated. 

This  formula  is  a  special  case  of  the  tetrad  ratio. 




CHAPTER  V 
The  Theory  of  Factors 

Problem  and  limitations.  If  a  number  of  tests  have  been 
given  to  a  group  of  individuals,  we  may  desire  to  know  the  nature 
and  relative  contributions  of  the  various  basic  mental  abilities  un- 
derlying the  test  performances.  In  previous  chapters  we  have  dealt 
with  special  aspects  of  this  problem — aspects  in  which  the  nature 
of  the  abilities  was  known,  in  the  sense  that  the  fundamental  fac- 
tor pattern  could  be  written  down  from  a-priori  considerations. 
In  the  more  general  cases  now  to  be  considered,  the  factor  pattern 
cannot  be  determined  beforehand.  In  fact,  the  central  problem 
of  the  theory  of  factors  is  to  find  the  simplest  factor  pattern  con- 
sistent with  a  given  set  of  data,  and  to  effect  an  analysis  of  the 
variances  of  the  different  tests  in  terms  of  the  factors  of  this 
pattern. 

It  is  impossible  to  start  with  a  set  of  test  data  and  work  back- 
wards to  a  determination  of  the  factor  pattern.  There  are  always 
an  indefinite  number  of  such  patterns  consistent  with  any  set  of 
data.  For  practical  purposes,  however,  we  may  use  the  law  of 
parsimony,  and  assume  that  the  simplest  of  these  is  the  "true" 
factor  pattern.  Then  in  ordinary  situations  we  shall  first  assume 
some  very  simple  pattern,  derive  the  equations  of  consistency, 
and  apply  them  to  the  data.  If  they  fit,  we  shall  go  no  further. 
If  they  do  not,  we  shall  postulate  some  slightly  more  complicated 
pattern,  and  repeat  the  process.  As  a  matter  of  fact,  however,  it 
will  be  found  that  the  same  equations  may  often  be  used  to  test 
several  hypothetical  factor  patterns.  The  present  discussion  will 
be  limited  to  two  of  the  simplest  of  these  patterns. 

The  theory  of  two  factors.  This  provides  the  simplest  of  all 
practical  factor  patterns.  It  assumes  that  among  any  set  of  tests 
not  obviously  similar,  there  will  exist  one  single  general  factor 
common  to  them  all,  and  that  each  test  in  addition  will  possess 
a  specific  factor  independent  of  all  other  specific  factors  and  of 
the  general  factor.  The  theory  has  been  vigorously  upheld  by 
Spearman  and  his  students  for  over  a  quarter-century.  For  four 
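tests, the factor pattern may be written,

$$\begin{aligned}
x_1 &= c_1 g + s_1, \\
x_2 &= c_2 g + s_2, \\
x_3 &= c_3 g + s_3, \\
x_4 &= c_4 g + s_4.
\end{aligned}$$

The value $g$ is the general or common factor, and the $s$'s are the
specific factors. From this pattern we obtain,

$$\begin{aligned}
p_{12} &= c_1 c_2 \sigma_g^2, & p_{23} &= c_2 c_3 \sigma_g^2, \\
p_{13} &= c_1 c_3 \sigma_g^2, & p_{24} &= c_2 c_4 \sigma_g^2, \\
p_{14} &= c_1 c_4 \sigma_g^2, & p_{34} &= c_3 c_4 \sigma_g^2,
\end{aligned}$$

from which we find

$$p_{12}\, p_{34} = p_{13}\, p_{24} = p_{14}\, p_{23} = c_1 c_2 c_3 c_4\, \sigma_g^4. \tag{1}$$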

Fundamental  tetrad  relation  for  the  theory  of  two  factors. 

The equality of the three left-hand terms in (1) is the test for the
consistency of a set of data with the theory of two factors. If we
divide each of these terms by $\sigma_1 \sigma_2 \sigma_3 \sigma_4$, we obtain,

$$r_{12}\, r_{34} = r_{13}\, r_{24} = r_{14}\, r_{23}. \tag{2}$$

Tetrad relation in terms of correlation coefficients.

If we divide each of these in turn by $R_1 R_2 R_3 R_4$, we find,

$$r_{\infty\omega}\, r_{\gamma\eta} = r_{\infty\gamma}\, r_{\omega\eta} = r_{\infty\eta}\, r_{\omega\gamma}. \tag{3}$$

Tetrad relation in terms of correlations corrected for attenuation.

Application  of  the  tetrad  relation.  It  has  been  the  custom  of 
most  investigators,  following  Spearman,  to  use  the  relation  of 
(2)  and  derive  therefrom  the  equations, 

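$$\begin{aligned}
r_{12}\, r_{34} - r_{13}\, r_{24} &= 0, \\
r_{12}\, r_{34} - r_{14}\, r_{23} &= 0, \\
r_{13}\, r_{24} - r_{14}\, r_{23} &= 0.
\end{aligned} \tag{4}$$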

Tetrad  difference  equations. 

There  are  three  more  such  equations  which  may  be  written  by 
reversing  the  signs  of  these,  but  of  the  six,  any  two  except  a  given 
one and its negative are sufficient to determine all the others.

There is no obvious reason except the historical one why the
tetrad differences should be expressed in terms of raw correlation
coefficients rather than in terms of covariances or of correlations
corrected for attenuation. Any one of these sets of tetrad
differences should equal zero according to the two-factor
theory. But their sampling errors are different, as are also their
numerical values when the theory of two factors does not hold.
Thus if we write the first equation of (4) in terms of the covariances
we have,

$$p_{12}\, p_{34} - p_{13}\, p_{24} = 0.$$

In terms of the raw correlations this becomes,

$$r_{12}\, r_{34} - r_{13}\, r_{24} = \frac{p_{12}\, p_{34} - p_{13}\, p_{24}}{\sigma_1 \sigma_2 \sigma_3 \sigma_4} = 0.$$

The sampling error now involves the errors not only of the
covariances, but also of the standard deviations. In terms of the
correlations corrected for attenuation, we find,

$$r_{\infty\omega}\, r_{\gamma\eta} - r_{\infty\gamma}\, r_{\omega\eta} = \frac{p_{12}\, p_{34} - p_{13}\, p_{24}}{\sigma_1 \sigma_2 \sigma_3 \sigma_4\, R_1 R_2 R_3 R_4} = 0.$$

The sampling error here involves the errors of the reliability
coefficients in addition to those of the covariances and standard
deviations.

The  tetrad  ratio.  There  is  another  method  of  writing  the 
tetrad  relation  which  will  give  identical  results,  whether  the  data 
be  given  in  terms  of  covariances,  raw  correlations,  or  correlations 
corrected  for  attenuation.  This  is  the  tetrad  ratio.  We  may 
obtain  from  (1), 

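$$\begin{aligned}
p_{12}\, p_{34} / p_{13}\, p_{24} &= 1, \\
p_{12}\, p_{34} / p_{14}\, p_{23} &= 1, \\
p_{13}\, p_{24} / p_{14}\, p_{23} &= 1.
\end{aligned} \tag{5}$$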

Tetrad  ratio  equations. 

If we use raw correlation coefficients as our original data, we
are simply dividing numerator and denominator simultaneously
by the product of the four obtained standard deviations. These
will always cancel. In a similar manner, if we take the correlations
corrected for attenuation as the original data, we are again
dividing numerator and denominator by the same value, in this
case the product of the four obtained reliability coefficients. The
numerical value of the tetrad ratio is the same, therefore, in each
case. This argument may be applied with equal force in the case
of the true values for a theoretically infinite population, from
which the given group may be assumed to be a random sample.
There is only one value of the true tetrad ratio. Now no matter
how the standard deviations and reliability coefficients of the
sample may differ from those of the population, it is these obtained
values which are substituted in the formula and which cancel.
We see, therefore, that the sampling error of the tetrad ratio
depends only on the errors in the covariances, no matter what
coefficients we use as our basic data. This property should be of
sufficient importance to cause investigators hereafter to employ the
tetrad ratio in preference to the tetrad difference.
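
The cancellation argument can be checked numerically. The following Python sketch, using NumPy, computes the tetrad ratios of (5) and the tetrad differences of (4) from a covariance matrix, from the corresponding raw correlations, and from correlations divided by a set of values standing in for $R_1, \ldots, R_4$. The matrix entries and the $R$ values are hypothetical numbers chosen only for illustration.

```python
import numpy as np

def tetrad_terms(m):
    """The three products entering the tetrad relation for a 4x4 matrix."""
    return m[0, 1] * m[2, 3], m[0, 2] * m[1, 3], m[0, 3] * m[1, 2]

def tetrad_ratios(m):
    a, b, c = tetrad_terms(m)
    return np.array([a / b, a / c, b / c])

def tetrad_differences(m):
    a, b, c = tetrad_terms(m)
    return np.array([a - b, a - c, b - c])

# Hypothetical covariance matrix of four tests (illustrative values only).
cov = np.array([
    [4.00, 1.20, 0.90, 0.80],
    [1.20, 9.00, 1.50, 1.10],
    [0.90, 1.50, 1.00, 0.40],
    [0.80, 1.10, 0.40, 2.25],
])
sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)              # raw correlations
R = np.array([0.95, 0.90, 0.85, 0.80])     # hypothetical stand-ins for R_1..R_4
corrected = corr / np.outer(R, R)          # off-diagonal entries are the corrected correlations

for name, m in (("covariances", cov), ("correlations", corr), ("corrected", corrected)):
    print(name, tetrad_ratios(m), tetrad_differences(m))
```

The three sets of ratios agree, while the three sets of differences do not, since each difference is divided by a different product of scale constants.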

Experimental conditions for factor-theory studies. The
theory of factors assumes that the correlations between tests are due
to common underlying abilities. It is necessary, therefore, that the
tests be given at intervals such that the response errors will not
introduce general or group factors. If four tests of independent
abilities be given at the same time, the common response errors
will introduce a spurious general factor. It would seem, therefore,
that each test would have to be given at a different time.
But in extensive investigations this is rarely practical. If each
test has two forms, all the Form A tests can be applied at one
time and all the Form B tests at another. Then we may correlate
the Form A score on any one of them with the Form B
score on any other without introducing any correlated response
errors. The factor pattern may then be written,

$$\begin{aligned}
&\text{Form A} & &\text{Form B} \\
x_1 &= c_1 g + k_1 s_1 + \delta_1, & x_i &= c_i g + k_i s_1 + \delta_i, \\
x_2 &= c_2 g + k_2 s_2 + \delta_2, & x_{ii} &= c_{ii} g + k_{ii} s_2 + \delta_{ii}, \\
x_3 &= c_3 g + k_3 s_3 + \delta_3, & x_{iii} &= c_{iii} g + k_{iii} s_3 + \delta_{iii}, \\
x_4 &= c_4 g + k_4 s_4 + \delta_4, & x_{iv} &= c_{iv} g + k_{iv} s_4 + \delta_{iv}.
\end{aligned} \tag{6}$$

The value $g$ is the general factor, the $s$'s are specific non-chance
factors, and the $\delta$'s are errors of measurement. The $\delta$'s of the
Form A tests are not assumed to be uncorrelated among themselves,
nor are those of the Form B tests, but the Form A $\delta$'s are
assumed to be uncorrelated with the Form B $\delta$'s. The $c$'s and $k$'s
are constants, and $c_1/c_i = k_1/k_i$, $c_2/c_{ii} = k_2/k_{ii}$, $c_3/c_{iii} = k_3/k_{iii}$, and
$c_4/c_{iv} = k_4/k_{iv}$, since the two forms of each test are supposed to
differ only in their units of measurement and errors, but not in
their relative make-up in terms of underlying abilities. Then
from this factor pattern we obtain,


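$$\begin{aligned}
p_{1,ii} &= c_1 c_{ii} \sigma_g^2, & p_{i2} &= c_i c_2 \sigma_g^2, \\
p_{1,iii} &= c_1 c_{iii} \sigma_g^2, & p_{i3} &= c_i c_3 \sigma_g^2, \\
p_{1,iv} &= c_1 c_{iv} \sigma_g^2, & p_{i4} &= c_i c_4 \sigma_g^2, \\
p_{2,iii} &= c_2 c_{iii} \sigma_g^2, & p_{ii,3} &= c_{ii} c_3 \sigma_g^2, \\
p_{2,iv} &= c_2 c_{iv} \sigma_g^2, & p_{ii,4} &= c_{ii} c_4 \sigma_g^2, \\
p_{3,iv} &= c_3 c_{iv} \sigma_g^2, & p_{iii,4} &= c_{iii} c_4 \sigma_g^2.
\end{aligned} \tag{7}$$

From these equations we find,

$$\begin{aligned}
p_{1,ii}\, p_{i2}\, p_{3,iv}\, p_{iii,4} &= p_{1,iii}\, p_{i3}\, p_{2,iv}\, p_{ii,4} \\
&= p_{1,iv}\, p_{i4}\, p_{2,iii}\, p_{ii,3} = c_1 c_i c_2 c_{ii} c_3 c_{iii} c_4 c_{iv}\, \sigma_g^8.
\end{aligned} \tag{8}$$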

Practical  tetrad  relation  for  the  theory  of  two  factors. 
Using  this  relation  we  obtain, 

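$$\begin{aligned}
(p_{1,ii}\, p_{i2}\, p_{3,iv}\, p_{iii,4} / p_{1,iii}\, p_{i3}\, p_{2,iv}\, p_{ii,4})^{1/2} &= 1, \\
(p_{1,ii}\, p_{i2}\, p_{3,iv}\, p_{iii,4} / p_{1,iv}\, p_{i4}\, p_{2,iii}\, p_{ii,3})^{1/2} &= 1, \\
(p_{1,iii}\, p_{i3}\, p_{2,iv}\, p_{ii,4} / p_{1,iv}\, p_{i4}\, p_{2,iii}\, p_{ii,3})^{1/2} &= 1.
\end{aligned} \tag{9}$$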

Practical  tetrad  ratio  equations. 

There  are  three  more  equations  of  this  sort  which  are  the  recip- 
rocals of  these.  Only  two  are  independent,  as  the  third  will 
always  be  the  quotient  (or  its  reciprocal)  obtained  by  dividing  one 
by  another.  Hence  in  practice  it  is  only  necessary  to  demonstrate 
that  two  of  the  tetrad  ratios  which  are  not  reciprocals,  are  not 
significantly  different  from  unity,  in  order  to  be  able  to  assert 
the  plausibility  of  the  theory  of  two  factors  as  an  explanation  of 
the  data. 

The correlation coefficients may replace the covariances in (9)
without changing either the values or the standard errors of these
equations. If we let the "true" scores underlying the two forms of
tests one, two, three, and four be $x_\infty$, $x_\omega$, $x_\gamma$, and $x_\eta$ respectively,
we may compute $r_{\infty\omega}$, $r_{\infty\gamma}$, $r_{\infty\eta}$, $r_{\omega\gamma}$, $r_{\omega\eta}$, and $r_{\gamma\eta}$ by formula (3) of
Chapter IV. If now we go back to equations (5) and substitute
these correlations corrected for attenuation, we obtain,

$$\begin{aligned}
r_{\infty\omega}\, r_{\gamma\eta} / r_{\infty\gamma}\, r_{\omega\eta} &= 1, \\
r_{\infty\omega}\, r_{\gamma\eta} / r_{\infty\eta}\, r_{\omega\gamma} &= 1, \\
r_{\infty\gamma}\, r_{\omega\eta} / r_{\infty\eta}\, r_{\omega\gamma} &= 1.
\end{aligned} \tag{10}$$

Alternate form of the practical tetrad ratio equations.

The values yielded by (10) are identical with those of (9), as the
denominators of all the correlations corrected for attenuation will
cancel.
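
The practical tetrad ratios of (9) may be illustrated with a short Python sketch using NumPy. It builds the cross-form covariances implied by factor pattern (6) from a set of hypothetical general-factor constants, with $\sigma_g^2$ taken as unity, and then evaluates the three ratios. The array layout and the names `P`, `c_form_a`, and `c_form_b` are assumptions made for this illustration only.

```python
import numpy as np

def practical_tetrad_ratios(P):
    """P[a, b] is the covariance of Form A of test a+1 with Form B of
    test b+1. Returns the three ratios of equations (9)."""
    def pair(a, b):
        # For example, pair(0, 1) is p_{1,ii} * p_{i2}.
        return P[a, b] * P[b, a]
    t12_34 = pair(0, 1) * pair(2, 3)
    t13_24 = pair(0, 2) * pair(1, 3)
    t14_23 = pair(0, 3) * pair(1, 2)
    return np.sqrt(np.array([t12_34 / t13_24, t12_34 / t14_23, t13_24 / t14_23]))

# Hypothetical constants c_1..c_4 (Form A) and c_i..c_iv (Form B).
c_form_a = np.array([0.8, 0.6, 0.7, 0.5])
c_form_b = np.array([1.6, 0.9, 0.35, 1.0])
P = np.outer(c_form_a, c_form_b)   # p_{a,b} = c_a c_b sigma_g^2, sigma_g^2 = 1
# Diagonal entries (two forms of the same test) are never used by (9).

ratios = practical_tetrad_ratios(P)

# A change of units in any form (for instance, conversion to correlations)
# multiplies rows and columns by constants and leaves the ratios unchanged.
scale_a = np.array([2.0, 0.5, 3.0, 1.5])
scale_b = np.array([0.1, 4.0, 1.0, 2.5])
ratios_rescaled = practical_tetrad_ratios(P * np.outer(scale_a, scale_b))
print(ratios, ratios_rescaled)
```

With data that follow the two-factor pattern exactly, all three ratios equal unity; with sample data they would be compared with unity in the light of their sampling errors.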

Analysis of variance in the theory of two factors. If equations
(9) or (10) have been shown to hold for a particular set of
data, we may determine the relative contributions of the general
factor, the non-chance specific factor, and the error of measurement to
the variance of any variable, provided the two forms of that
variable are equally reliable. Consider $x_1$ and $x_i$. The assumption
of equal reliability imposes the condition that $c_1/\sigma_1 = c_i/\sigma_i$, and
$k_1/\sigma_1 = k_i/\sigma_i$, since $c_1/c_i = k_1/k_i$. From the first of these ratio
equalities we obtain the important relation,

$$c_1^2 \sigma_g^2 / \sigma_1^2 = c_i^2 \sigma_g^2 / \sigma_i^2 = c_1 c_i \sigma_g^2 / \sigma_1 \sigma_i. \tag{11}$$

From the first of equations (6), we find,

$$\begin{aligned}
& c_1^2 \sigma_g^2 / \sigma_1^2 + k_1^2 \sigma_{s_1}^2 / \sigma_1^2 + \sigma_{\delta_1}^2 / \sigma_1^2 \\
&\quad = c_i^2 \sigma_g^2 / \sigma_i^2 + k_i^2 \sigma_{s_1}^2 / \sigma_i^2 + \sigma_{\delta_i}^2 / \sigma_i^2 = 1.
\end{aligned} \tag{12}$$

Analysis of variance of either form of variable one, when the
two forms are equally reliable, and the theory of two
factors holds.


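Now from (7),

$$\begin{aligned}
p_{1,ii}\, p_{i2} &= c_1 c_i c_2 c_{ii}\, \sigma_g^4, \\
p_{1,iii}\, p_{i3} &= c_1 c_i c_3 c_{iii}\, \sigma_g^4, \\
p_{2,iii}\, p_{ii,3} &= c_2 c_{ii} c_3 c_{iii}\, \sigma_g^4,
\end{aligned}$$

and,

$$(p_{1,ii}\, p_{i2}\, p_{1,iii}\, p_{i3} / p_{2,iii}\, p_{ii,3})^{1/2} = c_1 c_i \sigma_g^2.$$

From this equation and (11),

$$\begin{aligned}
c_1^2 \sigma_g^2 / \sigma_1^2 = c_i^2 \sigma_g^2 / \sigma_i^2 &= (p_{1,ii}\, p_{i2}\, p_{1,iii}\, p_{i3} / p_{2,iii}\, p_{ii,3}\, \sigma_1^2 \sigma_i^2)^{1/2} \\
&= (r_{1,ii}\, r_{i2}\, r_{1,iii}\, r_{i3} / r_{2,iii}\, r_{ii,3})^{1/2}.
\end{aligned} \tag{13}$$

Proportion of the variance of either form of variable one due
to the general factor, when the two forms are equally
reliable, and the theory of two factors holds.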

From the definition of the reliability coefficient and the assumption
that both forms of the test are equally reliable, we obtain,

$$\sigma_{\delta_1}^2 / \sigma_1^2 = \sigma_{\delta_i}^2 / \sigma_i^2 = 1 - r_{1i}. \tag{14}$$

Proportion of the variance of either form of variable one due
to the error of measurement, when these forms are equally
reliable.

This  result  is  independent  of  the  assumption  that  the  theory  of 
two  factors  holds.    Finally, 

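$$\begin{aligned}
k_1^2 \sigma_{s_1}^2 / \sigma_1^2 &= k_i^2 \sigma_{s_1}^2 / \sigma_i^2 \\
&= r_{1i} - (r_{1,ii}\, r_{i2}\, r_{1,iii}\, r_{i3} / r_{2,iii}\, r_{ii,3})^{1/2}.
\end{aligned} \tag{15}$$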

Proportion  of  the  variance  of  either  form  of  variable  one  due 
to  the  non-chance  specific  factor,  when  the  two  forms  are 
equally  reliable,  and  the  theory  of  two  factors  holds. 

An  analysis  similar  to  the  above  could  be  made  for  variable  one, 
taking  two  and  four  or  three  and  four  as  the  reference  variables, 
instead  of  two  and  three.  The  most  accurate  analysis  would  be 
made  by  using  each  of  the  three  sets  of  reference  variables  in  turn, 
and  averaging  the  results.  An  analysis  of  the  variances  of  two 
equally  reliable  forms  of  any  of  the  other  three  tests  could  be 
made  in  the  same  manner. 
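
Equations (13), (14), and (15) together give a complete analysis of the variance of variable one. A minimal Python sketch follows, taking tests two and three as the reference variables. The function name and the sample correlations are hypothetical; the correlations were constructed so as to agree exactly with the theory of two factors.

```python
import math

def two_factor_variance_shares(r_1_ii, r_i_2, r_1_iii, r_i_3,
                               r_2_iii, r_ii_3, r_1_i):
    """Proportions of the variance of either form of variable one due to
    the general factor (13), the non-chance specific factor (15), and
    the error of measurement (14)."""
    general = math.sqrt(r_1_ii * r_i_2 * r_1_iii * r_i_3 / (r_2_iii * r_ii_3))
    error = 1.0 - r_1_i
    specific = r_1_i - general
    return general, specific, error

# Hypothetical cross-form correlations (illustrative values only).
general, specific, error = two_factor_variance_shares(
    r_1_ii=0.42, r_i_2=0.42, r_1_iii=0.35, r_i_3=0.35,
    r_2_iii=0.30, r_ii_3=0.30, r_1_i=0.80,
)
print(general, specific, error)
```

Repeating the computation with the other two sets of reference variables and averaging, as suggested above, would only require calling the function with the corresponding correlations.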

The  square  root  of  any  of  these  proportional  contributions  will 
be  equal  to  the  correlation  between  the  contributing  factor  and 
the  total  variable.  The  proof  in  each  case  is  similar  to  the  proof 
that  the  square  root  of  the  reliability  coefficient  is  equal  to  the 
correlation  between  the  score  and  the  underlying  ability,  as  given 
in  Chapter  II  for  equation  (12)  of  that  Chapter. 

The  group-factor  theory.  According  to  this  hypothesis,  the 
correlations  between  tests  are  to  be  explained  on  the  basis  not  only 
of  one  general  factor  and  a  specific  factor  in  each  variable,  but  in 
addition,  of  one  or  more  independent  group  factors,  which  are 
common to some but not all of the observed variables. In the
simplest case involving four tests, we shall assume that in addition
to the general factor and the specific factors, there is a group
factor common to two of the variables. We then have the following
factor pattern,

$$\begin{aligned}
&\text{Form A} & &\text{Form B} \\
x_1 &= c_1 g + d_1 a + k_1 s_1 + \delta_1, & x_i &= c_i g + d_i a + k_i s_1 + \delta_i, \\
x_2 &= c_2 g + d_2 a + k_2 s_2 + \delta_2, & x_{ii} &= c_{ii} g + d_{ii} a + k_{ii} s_2 + \delta_{ii}, \\
x_3 &= c_3 g + k_3 s_3 + \delta_3, & x_{iii} &= c_{iii} g + k_{iii} s_3 + \delta_{iii}, \\
x_4 &= c_4 g + k_4 s_4 + \delta_4, & x_{iv} &= c_{iv} g + k_{iv} s_4 + \delta_{iv}.
\end{aligned} \tag{16}$$

The value $g$ is the general factor, $a$ is the group factor, the $s$'s are
specific non-chance factors, and the $\delta$'s are errors of measurement.
We shall assume as before that all the different factors are
uncorrelated except the $\delta$'s of Form A and the $\delta$'s of Form B, which may
be correlated respectively among themselves but not with each
other or with any of the other factors. The $c$'s, $d$'s and $k$'s are
constants. It is to be noted that $c_1/c_i = d_1/d_i = k_1/k_i$, etc., since
the two forms of any test are supposed to differ only in their units
of measurement and errors, and not in their relative proportions
of underlying abilities. We may then write,


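$$\begin{aligned}
p_{1,ii} &= c_1 c_{ii} \sigma_g^2 + d_1 d_{ii} \sigma_a^2, & p_{i2} &= c_i c_2 \sigma_g^2 + d_i d_2 \sigma_a^2, \\
p_{1,iii} &= c_1 c_{iii} \sigma_g^2, & p_{i3} &= c_i c_3 \sigma_g^2, \\
p_{1,iv} &= c_1 c_{iv} \sigma_g^2, & p_{i4} &= c_i c_4 \sigma_g^2, \\
p_{2,iii} &= c_2 c_{iii} \sigma_g^2, & p_{ii,3} &= c_{ii} c_3 \sigma_g^2, \\
p_{2,iv} &= c_2 c_{iv} \sigma_g^2, & p_{ii,4} &= c_{ii} c_4 \sigma_g^2, \\
p_{3,iv} &= c_3 c_{iv} \sigma_g^2, & p_{iii,4} &= c_{iii} c_4 \sigma_g^2.
\end{aligned} \tag{17}$$

From these relations we obtain,

$$\begin{aligned}
p_{1,iii}\, p_{i3}\, p_{2,iv}\, p_{ii,4} &= p_{1,iv}\, p_{i4}\, p_{2,iii}\, p_{ii,3} = c_1 c_i c_2 c_{ii} c_3 c_{iii} c_4 c_{iv}\, \sigma_g^8, \\
p_{1,ii}\, p_{i2}\, p_{3,iv}\, p_{iii,4} &= c_1 c_i c_2 c_{ii} c_3 c_{iii} c_4 c_{iv}\, \sigma_g^8 + \text{terms containing } \sigma_a^2.
\end{aligned} \tag{18}$$

Tetrad relation for the single group-factor theory.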


If the group factor had been in $x_3$, $x_{iii}$, $x_4$, and $x_{iv}$ instead of in
$x_1$, $x_i$, $x_2$, and $x_{ii}$, the last pair of equations (17) would have
contained the $\sigma_a^2$ term instead of the first pair. In this case,
however, (18) would remain unchanged. Hence we see that any
factor-pattern tests derived from (18) can only establish the fact
that a group factor exists either in variables one and two, or in
three and four, or in each of these pairs. From (18),

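$$\begin{aligned}
(p_{1,ii}\, p_{i2}\, p_{3,iv}\, p_{iii,4} / p_{1,iii}\, p_{i3}\, p_{2,iv}\, p_{ii,4})^{1/2}
&= (p_{1,ii}\, p_{i2}\, p_{3,iv}\, p_{iii,4} / p_{1,iv}\, p_{i4}\, p_{2,iii}\, p_{ii,3})^{1/2} \neq 1, \\
(p_{1,iii}\, p_{i3}\, p_{2,iv}\, p_{ii,4} / p_{1,iv}\, p_{i4}\, p_{2,iii}\, p_{ii,3})^{1/2} &= 1, \text{ or} \\
r_{\infty\gamma}\, r_{\omega\eta} / r_{\infty\eta}\, r_{\omega\gamma} &= 1.
\end{aligned} \tag{19}$$

Tetrad ratio equations for the single group-factor theory.

If we divide the second tetrad ratio of (19) by the first, their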
quotient  will  be  the  third.  Hence  the  substantial  equality  of  the 
first  two  is  demonstrated  at  once  when  the  third  is  shown  to  be 
equal  (within  its  sampling  error)  to  unity.  The  covariances  of 
the  two  pairs  of  variables  in  either  or  both  of  which  the  group 
factor  may  lie,  are  found  in  the  numerators  of  the  fractions  which 
are  not  equal  to  unity.  It  is  possible  to  form  three  more  tetrad 
ratio  equations  which  will  be  the  reciprocals  of  those  given  in 
(19),  but  these  are  not  needed  in  any  further  analysis. 
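
The behavior described by (19) can be seen in a small Python sketch using NumPy. Cross-form covariances are generated from factor pattern (16) with hypothetical constants, $\sigma_g^2$ and $\sigma_a^2$ both taken as unity, and a group factor in tests one and two. The function and variable names are assumptions made for this illustration.

```python
import numpy as np

def practical_tetrad_ratios(P):
    """P[a, b] is the covariance of Form A of test a+1 with Form B of
    test b+1. Returns the three tetrad ratios in the order used in (19)."""
    def pair(a, b):
        return P[a, b] * P[b, a]
    t12_34 = pair(0, 1) * pair(2, 3)
    t13_24 = pair(0, 2) * pair(1, 3)
    t14_23 = pair(0, 3) * pair(1, 2)
    return np.sqrt(np.array([t12_34 / t13_24, t12_34 / t14_23, t13_24 / t14_23]))

# Hypothetical constants; note c_1/c_i = d_1/d_i and c_2/c_ii = d_2/d_ii.
c_form_a = np.array([0.8, 0.6, 0.7, 0.5])
c_form_b = np.array([1.6, 0.9, 0.35, 1.0])
d_form_a = np.array([0.5, 0.4, 0.0, 0.0])   # group factor in tests one and two only
d_form_b = np.array([1.0, 0.6, 0.0, 0.0])

# Off-diagonal cross-form covariances per (17); the diagonal is not used.
P = np.outer(c_form_a, c_form_b) + np.outer(d_form_a, d_form_b)

first, second, third = practical_tetrad_ratios(P)
print(first, second, third)
```

The first two ratios are equal and differ from unity, while the third equals unity, which is the pattern that points to a group factor in variables one and two or in variables three and four.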

Analysis of variance in the single group-factor theory.
The demonstration that in a set of three tetrad ratios, one is
substantially equal to unity and one other is not, is sufficient to
establish the fact that the theory of two factors does not hold,
that the single group-factor theory may, and that if it does, the
group factor lies in one or the other or both of two particular pairs
of variables. But that is all it does establish. If we take each of
the pairs of variables suspected of containing a group factor with
other pairs, we may be able to discover in which pair it actually
resides. Sometimes such an analysis is unnecessary, as the
variables containing the group factor can be picked out simply by an
examination of the nature of the tests. If we know that of four
variables, only one particular pair contains a group factor, it is
possible to effect a partial analysis of variance of the different
variables. Consider variable one again, when the factor pattern
is known to be that given in (16). We must assume as before
that the two forms of test one are equally reliable, so that
$c_1/\sigma_1 = c_i/\sigma_i$, and equation (11) still applies. From the first of
equations (16),

$$\begin{aligned}
& c_1^2 \sigma_g^2 / \sigma_1^2 + d_1^2 \sigma_a^2 / \sigma_1^2 + k_1^2 \sigma_{s_1}^2 / \sigma_1^2 + \sigma_{\delta_1}^2 / \sigma_1^2 \\
&\quad = c_i^2 \sigma_g^2 / \sigma_i^2 + d_i^2 \sigma_a^2 / \sigma_i^2 + k_i^2 \sigma_{s_1}^2 / \sigma_i^2 + \sigma_{\delta_i}^2 / \sigma_i^2 = 1.
\end{aligned} \tag{20}$$

Analysis of variance of either form of variable one, when
these forms are equally reliable, and the single group-factor
theory holds.


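Now from (17),

$$\begin{aligned}
p_{1,iii}\, p_{i3} &= c_1 c_i c_3 c_{iii}\, \sigma_g^4, \\
p_{1,iv}\, p_{i4} &= c_1 c_i c_4 c_{iv}\, \sigma_g^4, \\
p_{3,iv}\, p_{iii,4} &= c_3 c_{iii} c_4 c_{iv}\, \sigma_g^4,
\end{aligned}$$

and,

$$(p_{1,iii}\, p_{i3}\, p_{1,iv}\, p_{i4} / p_{3,iv}\, p_{iii,4})^{1/2} = c_1 c_i \sigma_g^2.$$

From this equation and (11),

$$\begin{aligned}
c_1^2 \sigma_g^2 / \sigma_1^2 = c_i^2 \sigma_g^2 / \sigma_i^2 &= (p_{1,iii}\, p_{i3}\, p_{1,iv}\, p_{i4} / p_{3,iv}\, p_{iii,4}\, \sigma_1^2 \sigma_i^2)^{1/2} \\
&= (r_{1,iii}\, r_{i3}\, r_{1,iv}\, r_{i4} / r_{3,iv}\, r_{iii,4})^{1/2}.
\end{aligned} \tag{21}$$

Proportion of the variance of either form of variable one due
to the general factor, when there is a single group factor
in variables one and two, and the two forms of test one
are equally reliable.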

This equation is similar to (13), but subject to the limitation
that the reference variables must be those which do not contain
the group factor. Equation (14) still applies, giving the proportion
of the variance due to errors of measurement. This is as far
as it is possible to proceed with the analysis when there are only
four variables. By (20), the remainder of the variance, after the
proportions due to the general factor and to the error of
measurement have been subtracted from unity, will equal the
proportion due to the group factor and the non-chance specific
factor taken together. These last two proportions cannot be
separated on the basis of data from four variables.

The  variance  of  variable  two  can  be  partially  analyzed  in  sim- 
ilar fashion.  Those  of  variables  three  and  four  can  be  analyzed 
completely,  as  described  for  the  case  where  the  theory  of  two  fac- 
tors holds.  Variables  one  and  two  should  not  be  taken  together 
as  the  reference  variables.  Hence,  in  analyzing  the  variance  of 
variable  three  we  could  take  one  and  four  or  two  and  four;  and  in 
analyzing  variable  four,  one  and  three  or  two  and  three,  as  the 
reference  variables,  but  in  neither  case  could  we  take  one  and  two. 

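Analysis of covariance. Consider the covariance of $x_1$ and
$x_{ii}$, and of $x_i$ and $x_2$. From (17),

$$\begin{aligned}
p_{1,iii}\, p_{ii,3} &= c_1 c_{ii} c_3 c_{iii}\, \sigma_g^4, \\
p_{1,iv}\, p_{ii,4} &= c_1 c_{ii} c_4 c_{iv}\, \sigma_g^4, \\
p_{3,iv}\, p_{iii,4} &= c_3 c_{iii} c_4 c_{iv}\, \sigma_g^4,
\end{aligned}$$

and,

$$(p_{1,iii}\, p_{ii,3}\, p_{1,iv}\, p_{ii,4} / p_{3,iv}\, p_{iii,4})^{1/2} = c_1 c_{ii} \sigma_g^2 = \gamma_{1,ii}, \text{ say}. \tag{22}$$

Amount of the covariance of $x_1$ and $x_{ii}$ due to the general
factor.

$$\begin{aligned}
p_{i3}\, p_{2,iii} &= c_i c_2 c_3 c_{iii}\, \sigma_g^4, \\
p_{i4}\, p_{2,iv} &= c_i c_2 c_4 c_{iv}\, \sigma_g^4, \\
p_{3,iv}\, p_{iii,4} &= c_3 c_{iii} c_4 c_{iv}\, \sigma_g^4,
\end{aligned}$$

and,

$$(p_{i3}\, p_{2,iii}\, p_{i4}\, p_{2,iv} / p_{3,iv}\, p_{iii,4})^{1/2} = c_i c_2 \sigma_g^2 = \gamma_{i2}. \tag{23}$$

Amount of the covariance of $x_i$ and $x_2$ due to the general
factor.

Then from the first of equations (17),

$$d_1 d_{ii} \sigma_a^2 = p_{1,ii} - \gamma_{1,ii} = \alpha_{1,ii}, \text{ say}. \tag{24}$$

Amount of the covariance of $x_1$ and $x_{ii}$ due to the group factor.

Also,

$$d_i d_2 \sigma_a^2 = p_{i2} - \gamma_{i2} = \alpha_{i2}. \tag{25}$$

Amount of the covariance of $x_i$ and $x_2$ due to the group factor.
All the other covariances, as may be seen from (17), are due
entirely to the general factor.*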
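
The analysis of covariance in (22) to (25) may be sketched in Python with NumPy as follows. The covariances are generated from factor pattern (16) with hypothetical constants and with $\sigma_g^2 = \sigma_a^2 = 1$, so that the recovered amounts can be compared with the values used to build the data. The array layout and all names are assumptions made for illustration.

```python
import numpy as np

# Hypothetical constants of pattern (16); group factor in tests one and two.
c_form_a = np.array([0.8, 0.6, 0.7, 0.5])    # c_1 .. c_4
c_form_b = np.array([1.6, 0.9, 0.35, 1.0])   # c_i .. c_iv
d_form_a = np.array([0.5, 0.4, 0.0, 0.0])    # d_1, d_2
d_form_b = np.array([1.0, 0.6, 0.0, 0.0])    # d_i, d_ii

# P[a, b]: covariance of Form A of test a+1 with Form B of test b+1.
P = np.outer(c_form_a, c_form_b) + np.outer(d_form_a, d_form_b)

denominator = P[2, 3] * P[3, 2]              # p_{3,iv} p_{iii,4}

# (22): general-factor part of the covariance of x_1 and x_ii.
gamma_1_ii = np.sqrt(P[0, 2] * P[2, 1] * P[0, 3] * P[3, 1] / denominator)
# (23): general-factor part of the covariance of x_i and x_2.
gamma_i_2 = np.sqrt(P[2, 0] * P[1, 2] * P[3, 0] * P[1, 3] / denominator)

# (24) and (25): the remainders are due to the group factor.
alpha_1_ii = P[0, 1] - gamma_1_ii
alpha_i_2 = P[1, 0] - gamma_i_2
print(gamma_1_ii, alpha_1_ii, gamma_i_2, alpha_i_2)
```

Only covariances between forms given on different occasions enter the computation, in keeping with the experimental conditions laid down earlier in the chapter.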

Analysis  of  variance  with  five  variables.  If  we  have  five 
variables  with  a  group  factor  running  through  three  of  them,  it 
is  possible  to  make  a  complete  analysis  of  variance.  The  pres- 
ence of  such  a  group  factor  may  be  established  by  equations  (19), 
taking  each  of  the  three  possible  pairs  of  variables  suspected  of 
containing  it,  with  a  number  of  pairs  of  other  variables  which 
among  themselves  conform  to  the  theory  of  two  factors.  The 
factor  pattern  will  be  exactly  like  (16),  with  the  single  additional 
pair  of  relations, 

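$$x_5 = c_5 g + d_5 a + k_5 s_5 + \delta_5, \qquad x_v = c_v g + d_v a + k_v s_5 + \delta_v.$$

Then in addition to (17) we will have,

$$\begin{aligned}
p_{1,v} &= c_1 c_v \sigma_g^2 + d_1 d_v \sigma_a^2, & p_{i5} &= c_i c_5 \sigma_g^2 + d_i d_5 \sigma_a^2, \\
p_{2,v} &= c_2 c_v \sigma_g^2 + d_2 d_v \sigma_a^2, & p_{ii,5} &= c_{ii} c_5 \sigma_g^2 + d_{ii} d_5 \sigma_a^2, \\
p_{3,v} &= c_3 c_v \sigma_g^2, & p_{iii,5} &= c_{iii} c_5 \sigma_g^2, \\
p_{4,v} &= c_4 c_v \sigma_g^2, & p_{iv,5} &= c_{iv} c_5 \sigma_g^2.
\end{aligned} \tag{26}$$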


By  an  argument  similar  to  that  leading  to  (22),  (23),  (24),  and 
(25),  we  have, 

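* I am indebted to my colleagues, Mr. Jack W. Dunlap and Dr. Irving Lorge,
for suggesting the above analysis of covariance, and the analysis of variance in
the case of five variables, which follows.

$$\begin{aligned}
(p_{1,iii}\, p_{3,v}\, p_{1,iv}\, p_{4,v} / p_{3,iv}\, p_{iii,4})^{1/2} &= c_1 c_v \sigma_g^2 = \gamma_{1,v}, \\
(p_{i3}\, p_{iii,5}\, p_{i4}\, p_{iv,5} / p_{3,iv}\, p_{iii,4})^{1/2} &= c_i c_5 \sigma_g^2 = \gamma_{i5}, \\
(p_{2,iii}\, p_{3,v}\, p_{2,iv}\, p_{4,v} / p_{3,iv}\, p_{iii,4})^{1/2} &= c_2 c_v \sigma_g^2 = \gamma_{2,v}, \\
(p_{ii,3}\, p_{iii,5}\, p_{ii,4}\, p_{iv,5} / p_{3,iv}\, p_{iii,4})^{1/2} &= c_{ii} c_5 \sigma_g^2 = \gamma_{ii,5}.
\end{aligned}$$

Then,

$$d_1 d_v \sigma_a^2 = p_{1,v} - \gamma_{1,v} = \alpha_{1,v}. \tag{27}$$

$$d_i d_5 \sigma_a^2 = p_{i5} - \gamma_{i5} = \alpha_{i5}. \tag{28}$$

$$d_2 d_v \sigma_a^2 = p_{2,v} - \gamma_{2,v} = \alpha_{2,v}. \tag{29}$$

$$d_{ii} d_5 \sigma_a^2 = p_{ii,5} - \gamma_{ii,5} = \alpha_{ii,5}. \tag{30}$$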

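The six equations, (24), (25), (27), (28), (29) and (30), form a
system which may be solved for $d_1 d_i \sigma_a^2$, $d_2 d_{ii} \sigma_a^2$, and $d_5 d_v \sigma_a^2$. If we
assume that the two forms of each of the tests are equally reliable, we
find that $c_1/\sigma_1 = c_i/\sigma_i$, $d_1/\sigma_1 = d_i/\sigma_i$, $k_1/\sigma_1 = k_i/\sigma_i$, and that similar
relations hold for variables two and five, since $c_1/c_i = d_1/d_i = k_1/k_i$,
etc. From the first of these ratio equations we obtain equation
(11) again, and from the second,

$$d_1^2 \sigma_a^2 / \sigma_1^2 = d_i^2 \sigma_a^2 / \sigma_i^2 = d_1 d_i \sigma_a^2 / \sigma_1 \sigma_i. \tag{31}$$

From (24), (25), (27), (28), (29), and (30),

$$\begin{aligned}
(\alpha_{1,ii}\, \alpha_{i2}\, \alpha_{1,v}\, \alpha_{i5} / \alpha_{2,v}\, \alpha_{ii,5})^{1/2} &= d_1 d_i \sigma_a^2, \\
(\alpha_{1,ii}\, \alpha_{i2}\, \alpha_{2,v}\, \alpha_{ii,5} / \alpha_{1,v}\, \alpha_{i5})^{1/2} &= d_2 d_{ii} \sigma_a^2, \\
(\alpha_{1,v}\, \alpha_{i5}\, \alpha_{2,v}\, \alpha_{ii,5} / \alpha_{1,ii}\, \alpha_{i2})^{1/2} &= d_5 d_v \sigma_a^2,
\end{aligned}$$

and from (31),

$$\begin{aligned}
d_1^2 \sigma_a^2 / \sigma_1^2 = d_i^2 \sigma_a^2 / \sigma_i^2 &= d_1 d_i \sigma_a^2 / \sigma_1 \sigma_i \\
&= (\alpha_{1,ii}\, \alpha_{i2}\, \alpha_{1,v}\, \alpha_{i5} / \alpha_{2,v}\, \alpha_{ii,5}\, \sigma_1^2 \sigma_i^2)^{1/2}.
\end{aligned} \tag{32}$$

Proportion of the variance of either form of variable one due
to the group factor, when there is a single group factor in
variables one, two, and five, and the two forms of each of
these tests are equally reliable.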

Similar equations can of course be written for variables two and
five. Equations (14) and (21) give the proportions of the variance
of variable one due to the chance factor and the general factor
respectively, as before. We have finally,

$$\begin{aligned}
k_1^2 \sigma_{s_1}^2 / \sigma_1^2 = k_i^2 \sigma_{s_1}^2 / \sigma_i^2 &= k_1 k_i \sigma_{s_1}^2 / \sigma_1 \sigma_i \\
&= r_{1i} - (r_{1,iii}\, r_{i3}\, r_{1,iv}\, r_{i4} / r_{3,iv}\, r_{iii,4})^{1/2} \\
&\quad - (\alpha_{1,ii}\, \alpha_{i2}\, \alpha_{1,v}\, \alpha_{i5} / \alpha_{2,v}\, \alpha_{ii,5}\, \sigma_1^2 \sigma_i^2)^{1/2}.
\end{aligned} \tag{33}$$

Proportion of the variance of either form of variable one due
to the non-chance specific factor, when there is a single group
factor in variables one, two, and five, and the two forms of each of
these tests are equally reliable.
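
The solution of the six group-factor equations can be illustrated with a brief Python sketch. The six $\alpha$ values are built from hypothetical group-factor constants with $\sigma_a^2$ taken as unity, and the three products of equation (32) and the lines preceding it are then recovered. The function name, the constants, and the two standard deviations are assumptions made for this illustration.

```python
import math

def group_factor_products(a_1_ii, a_i_2, a_1_v, a_i_5, a_2_v, a_ii_5):
    """Solve (24), (25), (27)-(30) for d_1 d_i, d_2 d_ii, and d_5 d_v
    (each multiplied by the variance of the group factor)."""
    d1_di = math.sqrt(a_1_ii * a_i_2 * a_1_v * a_i_5 / (a_2_v * a_ii_5))
    d2_dii = math.sqrt(a_1_ii * a_i_2 * a_2_v * a_ii_5 / (a_1_v * a_i_5))
    d5_dv = math.sqrt(a_1_v * a_i_5 * a_2_v * a_ii_5 / (a_1_ii * a_i_2))
    return d1_di, d2_dii, d5_dv

# Hypothetical group-factor constants for tests one, two, and five.
d_form_a = {1: 0.5, 2: 0.4, 5: 0.3}     # d_1, d_2, d_5
d_form_b = {1: 1.0, 2: 0.6, 5: 0.45}    # d_i, d_ii, d_v

d1_di, d2_dii, d5_dv = group_factor_products(
    a_1_ii=d_form_a[1] * d_form_b[2],
    a_i_2=d_form_b[1] * d_form_a[2],
    a_1_v=d_form_a[1] * d_form_b[5],
    a_i_5=d_form_b[1] * d_form_a[5],
    a_2_v=d_form_a[2] * d_form_b[5],
    a_ii_5=d_form_b[2] * d_form_a[5],
)

# (32): proportion of the variance of variable one due to the group factor,
# with hypothetical standard deviations satisfying d_1/sigma_1 = d_i/sigma_i.
sigma_1, sigma_i = 1.0, 2.0
group_share = d1_di / (sigma_1 * sigma_i)
print(d1_di, d2_dii, d5_dv, group_share)
```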

Analysis of "true" variance. The whole problem of the
analysis of variance may be approached in another manner by
taking the estimated "true" variance as unity instead of the
obtained variance. If the two forms of a test are equally reliable,
we have from the definition of the reliability coefficient,

$$c_1^2 \sigma_\infty^2 = \sigma_1^2\, r_{1i}. \tag{34}$$

Estimated "true" variance of variable one, measured in the
units of $x_1$.

The constant $c_1^2$ indicates merely that $\sigma_\infty^2$ is measured in the units
of Form A of variable one. An alternative analysis of the variance
of $x_1$ or $x_i$, when these are equally reliable, may be made by
analyzing the variance of $x_\infty$, and multiplying the values so obtained by
$r_{1i}$. In the more complicated cases involving five variables, this
method of analysis is simpler than the one previously outlined.

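Let the portions of variables one, two, three, and four which are
not chance be denoted $x_\infty$, $x_\omega$, $x_\gamma$, and $x_\eta$. The factor pattern may
then be written,

$$\begin{aligned}
x_\infty &= c_1 g + s_1, \\
x_\omega &= c_2 g + s_2, \\
x_\gamma &= c_3 g + s_3, \\
x_\eta &= c_4 g + s_4.
\end{aligned} \tag{35}$$

We then define the values,

$$\begin{aligned}
\alpha_1 &= c_1 \sigma_g / \sigma_\infty, \\
\alpha_2 &= c_2 \sigma_g / \sigma_\omega, \\
\alpha_3 &= c_3 \sigma_g / \sigma_\gamma, \\
\alpha_4 &= c_4 \sigma_g / \sigma_\eta.
\end{aligned} \tag{36}$$

From (35) and (36) we find,

$$\begin{aligned}
r_{\infty\omega} &= \alpha_1 \alpha_2, & r_{\omega\gamma} &= \alpha_2 \alpha_3, \\
r_{\infty\gamma} &= \alpha_1 \alpha_3, & r_{\omega\eta} &= \alpha_2 \alpha_4, \\
r_{\infty\eta} &= \alpha_1 \alpha_4, & r_{\gamma\eta} &= \alpha_3 \alpha_4.
\end{aligned} \tag{37}$$

Then,

$$r_{\infty\omega}\, r_{\gamma\eta} = r_{\infty\gamma}\, r_{\omega\eta} = r_{\infty\eta}\, r_{\omega\gamma} = \alpha_1 \alpha_2 \alpha_3 \alpha_4. \tag{38}$$

This is the tetrad relation, and it has already been proved that
the tetrad ratio in this case is the same as that obtained from the
covariances or correlation coefficients.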

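The analysis of variance in this situation is comparatively
simple. From (36) and (37),

$$\begin{aligned}
c_1^2 \sigma_g^2 / \sigma_\infty^2 = \alpha_1^2 &= r_{\infty\omega}\, r_{\infty\gamma} / r_{\omega\gamma} \\
&= r_{\infty\omega}\, r_{\infty\eta} / r_{\omega\eta} = r_{\infty\gamma}\, r_{\infty\eta} / r_{\gamma\eta}.
\end{aligned} \tag{39}$$

Proportion of the "true" variance of variable one due to the
general factor, when the theory of two factors holds.

This formula has been used by Cureton and Dunlap (1930), and
called the triad by them.

$$\begin{aligned}
\sigma_{s_1}^2 / \sigma_\infty^2 = 1 - \alpha_1^2 &= 1 - r_{\infty\omega}\, r_{\infty\gamma} / r_{\omega\gamma} \\
&= 1 - r_{\infty\omega}\, r_{\infty\eta} / r_{\omega\eta} = 1 - r_{\infty\gamma}\, r_{\infty\eta} / r_{\gamma\eta}.
\end{aligned} \tag{40}$$

Proportion of the "true" variance of variable one due to the
specific factor, when the theory of two factors holds.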

Note  that  the  term,  "specific  factor"  as  used  here,  means  the  non- 
chance  specific  factor.  The  error  of  measurement  or  chance  spe- 
cific factor  is  eliminated  from  consideration  entirely  by  the  defi- 
nition of  the  "true"  variance  and  the  use  of  correlations  corrected 
for  attenuation. 
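The two-factor relations (37) to (40) can be illustrated with a small numerical sketch. The loadings in `alpha` are hypothetical values chosen only for illustration, and the helper name `triad_estimates` is not from the text; the code simply builds the "true" correlations of (37) and recovers the general-factor share of variable one from each of the three forms of (39).

```python
import numpy as np

def triad_estimates(R, i):
    """Return every triad r_ij * r_ik / r_jk for variable i, as in (39)."""
    others = [k for k in range(R.shape[0]) if k != i]
    out = []
    for a in range(len(others)):
        for b in range(a + 1, len(others)):
            j, k = others[a], others[b]
            out.append(R[i, j] * R[i, k] / R[j, k])
    return out

# Hypothetical general-factor loadings alpha_1 .. alpha_4 (assumed values).
alpha = np.array([0.9, 0.8, 0.7, 0.6])
R = np.outer(alpha, alpha)        # r_jk = alpha_j * alpha_k, as in (37)
np.fill_diagonal(R, 1.0)

general = triad_estimates(R, 0)           # three estimates of alpha_1 squared
specific = [1.0 - g for g in general]     # formula (40)
# The three tetrad products of (38) should all agree.
tetrads = [R[0, 1] * R[2, 3], R[0, 2] * R[1, 3], R[0, 3] * R[1, 2]]
```

With error-free correlations the three triads coincide; with sample values they would differ, and their disagreement is one informal indication of how well the theory of two factors fits.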

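For the single group-factor theory we have the factor pattern,

$$\begin{aligned}
x_\infty &= c_1 g + d_1 a + s_1,\\
x_\omega &= c_2 g + d_2 a + s_2,\\
x_\gamma &= c_3 g + s_3,\\
x_\eta &= c_4 g + s_4,\\
x_\theta &= c_5 g + d_5 a + s_5.
\end{aligned}$$

If we define,

$$\beta_1 = d_1\sigma_a/\sigma_\infty, \quad \beta_2 = d_2\sigma_a/\sigma_\omega, \quad \beta_5 = d_5\sigma_a/\sigma_\theta, \tag{41}$$

we obtain,

$$\begin{aligned}
r_{\infty\omega} &= \alpha_1\alpha_2 + \beta_1\beta_2, & r_{\omega\eta} &= \alpha_2\alpha_4,\\
r_{\infty\gamma} &= \alpha_1\alpha_3, & r_{\omega\theta} &= \alpha_2\alpha_5 + \beta_2\beta_5,\\
r_{\infty\eta} &= \alpha_1\alpha_4, & r_{\gamma\eta} &= \alpha_3\alpha_4,\\
r_{\infty\theta} &= \alpha_1\alpha_5 + \beta_1\beta_5, & r_{\gamma\theta} &= \alpha_3\alpha_5,\\
r_{\omega\gamma} &= \alpha_2\alpha_3, & r_{\eta\theta} &= \alpha_4\alpha_5.
\end{aligned} \tag{42}$$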
Then,

$$c_1^2\sigma_g^2/\sigma_\infty^2 = \alpha_1^2 = r_{\infty\gamma}r_{\infty\eta}/r_{\gamma\eta}. \tag{43}$$

Proportion  of  the  "true"  variance  of  variable  one  due  to  the 
general  factor,  when  the  single  group-factor  theory  holds. 

This is the particular one of equations (39) which does not take $x_\omega$ as one of the reference variables.

Analyzing now the correlation $r_{\infty\omega}$,

$$\alpha_1\alpha_2 = r_{\infty\gamma}r_{\omega\eta}/r_{\gamma\eta}. \tag{44}$$

Amount  of  the  "true"  correlation  between  variables  one  and 
two  due  to  the  general  factor. 

$$\beta_1\beta_2 = r_{\infty\omega} - r_{\infty\gamma}r_{\omega\eta}/r_{\gamma\eta}. \tag{45}$$

Amount  of  the  "true"  correlation  between  variables  one  and 
two  due  to  the  group  factor,  when  the  single  group-factor 
theory  holds. 

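By a similar line of reasoning, corresponding expressions are obtained for $\beta_1\beta_5$ and $\beta_2\beta_5$.

From these equations and (45),

$$d_1^2\sigma_a^2/\sigma_\infty^2 = \beta_1^2 = (\beta_1\beta_2)(\beta_1\beta_5)/(\beta_2\beta_5). \tag{46}$$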

Proportion  of  the  "true"  variance  of  variable  one  due  to  the 
group  factor,  when  the  single  group-factor  theory  holds. 

Finally, 

$$\sigma_{s_1}^2/\sigma_\infty^2 = 1 - \alpha_1^2 - \beta_1^2. \tag{47}$$

Proportion  of  the  "true"  variance  of  variable  one  due  to  the 
specific  factor,  when  the  single  group-factor  theory  holds. 

Variables  two  and  five  can  be  analyzed  in  similar  fashion,  and 
variables  three  and  four  can  be  analyzed  as  was  variable  one  for 
the  two-factor-theory  case,  taking  care  not  to  use  variables  one 
and  two  or  one  and  five  or  two  and  five  as  the  reference  variables. 
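The single group-factor analysis of (43) to (47) may be sketched numerically as follows. The loadings `alpha` and `beta` are hypothetical, and the expressions used for $\beta_1\beta_5$ and $\beta_2\beta_5$ are an assumption: they follow the pattern of (45), with variables three and four as the reference variables, since these are the only two free of the group factor.

```python
import numpy as np

# Index the five "true" variables: infinity, omega, gamma, eta, theta.
INF, OM, GA, ET, TH = range(5)

# Hypothetical loadings (assumed values); the group factor enters 1, 2 and 5 only.
alpha = np.array([0.7, 0.6, 0.8, 0.5, 0.6])
beta = np.array([0.5, 0.4, 0.0, 0.0, 0.3])
R = np.outer(alpha, alpha) + np.outer(beta, beta)   # pattern of (42)
np.fill_diagonal(R, 1.0)

def general_part(R, i, j):
    """Part of r_ij due to g, using gamma and eta as references, as in (44)."""
    return R[i, GA] * R[j, ET] / R[GA, ET]

def group_part(R, i, j):
    """Part of r_ij due to the group factor, as in (45)."""
    return R[i, j] - general_part(R, i, j)

g_share = R[INF, GA] * R[INF, ET] / R[GA, ET]        # (43)
b12 = group_part(R, INF, OM)
b15 = group_part(R, INF, TH)
b25 = group_part(R, OM, TH)
a_share = b12 * b15 / b25                            # (46)
s_share = 1.0 - g_share - a_share                    # (47)
```

Here the three shares for variable one come out as the squares of its assumed loadings and their complement, which is all the check amounts to.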


CHAPTER VI
Summary of Important Formulas, with their Standard Errors

Assumptions involved in standard error derivations. The principal assumptions involved in the derivation of the standard errors given hereafter are:

1.  That  all  samples  are  drawn  from  populations  normally  dis- 
tributed with  respect  to  all  the  variables  measured. 

2.  That  all  samples  are  drawn  from  populations  in  which  the 
regressions  of  all  the  variables  on  one  another  are  linear. 

3.  That  all  samples  are  sufficiently  large  so  that  higher  powers 
of  the  sampling  errors  are  small  in  comparison  with  first  powers, 
and  may  be  neglected. 

The sampling errors of correlation functions depend essentially on the simultaneous error-distribution of the variances and covariances of the system. This distribution has been determined by Wishart (1928), who provides a table of its moments up to the eighth order and four variables, in terms of the variances and correlations of the sampled population. Since we have assumed a large sample, we may replace population values by sample values. Furthermore, we may replace the value $(N-1)/N^2$ by $1/N$, N being the total frequency of the sample. Then, expressing our results in terms of variances and covariances instead of in terms of variances and correlation coefficients, we have from Wishart's table of moments,


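$$\begin{aligned}
N P_{\sigma_1^2\sigma_2^2} &= 2p_{12}^2, & N\sigma^2_{p_{12}} &= \sigma_1^2\sigma_2^2 + p_{12}^2,\\
N P_{\sigma_1^2 p_{12}} &= 2\sigma_1^2 p_{12}, & N P_{\sigma_1^2 p_{23}} &= 2p_{12}p_{13},\\
N P_{p_{12}p_{13}} &= \sigma_1^2 p_{23} + p_{12}p_{13}, & N P_{p_{12}p_{34}} &= p_{13}p_{24} + p_{14}p_{23}.
\end{aligned} \tag{1}$$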


The large $\sigma^2$ and the capital $P$ represent the sampling variance and covariance respectively.



Most of the important functions discussed in the previous chapters are in a form consisting of a series of products and quotients of variances and covariances. With the assumptions stated above, the standard error of any such function may be determined to a first approximation by taking its differential or logarithmic differential, squaring, summing for all samples, dividing by the number of samples, and substituting the values of sampling variances and covariances from (1).
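The procedure just described can be sketched in code. The sketch below assumes the general large-sample normal-theory relation $N\,P_{p_{ij}p_{kl}} = p_{ik}p_{jl} + p_{il}p_{jk}$, of which the entries of (1) are special cases; the function names and the covariance matrix `S` are hypothetical. It propagates first-order errors for the function $F = p_{12} - p_{13}$ and compares the result with the closed form obtained by expanding the square by hand.

```python
import math
import numpy as np

def n_cov(S, a, b):
    """N times the large-sample covariance of sample covariances p_a and p_b."""
    (i, j), (k, l) = a, b
    return S[i, k] * S[j, l] + S[i, l] * S[j, k]

def first_order_se(S, grad, N):
    """Standard error of F from its partial derivatives grad[(i, j)] = dF/dp_ij."""
    total = 0.0
    for a, ga in grad.items():
        for b, gb in grad.items():
            total += ga * gb * n_cov(S, a, b)
    return math.sqrt(total / N)

# Hypothetical variances and covariances of three forms (assumed values).
S = np.array([[4.0, 1.2, 1.0],
              [1.2, 3.0, 0.9],
              [1.0, 0.9, 2.5]])
N = 400

# F = p12 - p13, so dF/dp12 = 1 and dF/dp13 = -1 (indices are zero-based here).
se = first_order_se(S, {(0, 1): 1.0, (0, 2): -1.0}, N)

# Closed form from expanding the square by hand.
closed = math.sqrt((S[0, 0] * (S[1, 1] + S[2, 2] - 2 * S[1, 2])
                    + S[0, 1] ** 2 + S[0, 2] ** 2
                    - 2 * S[0, 1] * S[0, 2]) / N)
```

For functions built from products and quotients, the same routine would be applied to the logarithm of the function, with the result multiplied by the function's value.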

There is one other approximation that enters into certain formulas. If we know that the sampling error in $F^2$, say, is $\Delta$, we may wish to know the corresponding error in $F$. Now $(F^2+\Delta)^{1/2} = F + \Delta/2F - \Delta^2/8F^3 + \ldots$, and if $F$ is large as compared to $\Delta$, the term $\Delta^2/8F^3$, together with all subsequent terms in the expansion, will be negligible in comparison with $\Delta/2F$. Then if $\delta$ is the error in $F$, $\delta = \Delta/2F$, to a first approximation.
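A brief numerical illustration of this approximation, with hypothetical values of $F$ and $\Delta$ and a helper name chosen only for this sketch:

```python
import math

def error_in_root(F, delta):
    """Exact and first-order error in F when F squared is in error by delta."""
    exact = math.sqrt(F ** 2 + delta) - F
    approx = delta / (2 * F)
    return exact, approx

# Assumed values: F = 0.8, and an error of 0.01 in F squared.
exact, approx = error_in_root(0.8, 0.01)
# The first neglected term of the expansion.
neglected = 0.01 ** 2 / (8 * 0.8 ** 3)
```

The difference between `exact` and `approx` is of the order of `neglected`, which is small so long as $\Delta$ is small relative to $F$.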

Notation.  In  order  to  facilitate  the  work  of  derivation,  a  new 
system  of  notation  has  been  introduced.  Its  relation  to  the  system 
used  in  the  previous  chapters  will  be  apparent  at  once  from  the 
table  following. 


        Form A                Form B
     New     Old           New     Old
      1       1             3       i
      4       2             2       ii
      5       3             7       iii
      8       4             6       iv
      9       5            11       v
     12       6            10       vi

This system was designed so that a series of products such as $p_{1.\mathrm{ii}}\,p_{\mathrm{i}.2}\,p_{3.\mathrm{iv}}\,p_{\mathrm{iii}.4}\ldots$ could be represented by $p_{12}\,p_{34}\,p_{56}\,p_{78}\ldots$
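The correspondence between the two systems of subscripts can be written as a simple lookup. The dictionary below transcribes the table of old and new subscripts; the names `NEW_INDEX` and `renumber` are illustrative only.

```python
# Old subscript (Form A: arabic, Form B: roman) -> new subscript.
NEW_INDEX = {
    "1": 1, "2": 4, "3": 5, "4": 8, "5": 9, "6": 12,        # Form A
    "i": 3, "ii": 2, "iii": 7, "iv": 6, "v": 11, "vi": 10,  # Form B
}

def renumber(pairs):
    """Translate pairs of old subscripts into pairs of new subscripts."""
    return [tuple(NEW_INDEX[s] for s in pair) for pair in pairs]

# The series of products quoted in the text.
new_pairs = renumber([("1", "ii"), ("i", "2"), ("3", "iv"), ("iii", "4")])
```

The result reproduces the sequence of subscripts 12, 34, 56, 78 that the new notation was designed to give.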

In  the  following  paragraphs,  the  more  important  formulas  of 
the  previous  chapters  are  given  again  in  the  new  notation.  In 
many  cases  two  such  formulas  are  of  the  same  form  algebraically. 
In  each  case  the  original  uses  of  the  formula  are  given,  together 
with  the  chapter  and  formula  numbers  under  which  it  has  previ- 
ously appeared.  Its  standard  error,  to  the  degree  of  approxima- 
tion stated  above,  is  presented.  The  algebra  involved  in  the  deri- 
vation of  these  standard  errors  is  too  lengthy  to  be  included  here.* 

*  A  copy  of  these  derivations  is  on  file  at  the  Library  of  Teachers  College, 
Columbia  University. 
1.

$$F = r_{13}^{1/2}. \tag{2}$$

Index of reliability determined from two equally reliable forms (II, (12) and (10)).

$$\sigma_F = (1 - r_{13}^2)/2(r_{13}N)^{1/2}. \tag{3}$$

2.

$$F = nr_{13}/(1 + (n-1)r_{13}). \tag{4}$$

Spearman-Brown formula (II, (28)).

$$\sigma_F = (n - nr_{13}^2)/[N^{1/2}(1 + (n-1)r_{13})^2]. \tag{5}$$

This formula was first given by Shen (1924). If n = 2, we have the special case,

$$F' = 2r_{13}/(1 + r_{13}). \tag{6}$$

Reliability of the sum of two comparable forms of a test (II, (15)).

$$\sigma_{F'} = (2 - 2F')/N^{1/2}. \tag{7}$$
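Formulas (4) to (7) translate directly into code. The following sketch uses hypothetical function names and an assumed reliability of 0.6 in a sample of 100; it also confirms that (7) is the special case of (5) for n = 2.

```python
import math

def spearman_brown(r13, n):
    """Reliability of a test lengthened n times, formula (4)."""
    return n * r13 / (1 + (n - 1) * r13)

def spearman_brown_se(r13, n, N):
    """Standard error of (4), formula (5)."""
    return (n - n * r13 ** 2) / (math.sqrt(N) * (1 + (n - 1) * r13) ** 2)

# Assumed values: r13 = 0.6 between two forms, N = 100 individuals.
F2 = spearman_brown(0.6, 2)                  # formula (6)
se_general = spearman_brown_se(0.6, 2, 100)  # formula (5) with n = 2
se_special = (2 - 2 * F2) / math.sqrt(100)   # formula (7)
```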


3.

$$F = \sigma_1 - \sigma_3. \tag{8}$$

Partial check on the comparability of two forms of a test. See discussion immediately preceding II, (27).

$$\sigma_F = (\sigma_1^2 + \sigma_3^2 - 2r_{13}^2\sigma_1\sigma_3)^{1/2}/(2N)^{1/2}. \tag{9}$$

4.

$$F = p_{12} - p_{13}. \tag{10}$$

$$F' = p_{12} - p_{34}. \tag{11}$$

Test for equality of basic units of measurement of three or more forms of a test. See discussion immediately preceding II, (22).

Check on assumption that errors of measurement are uncorrelated, applicable when the two forms of each test are known to measure in the same basic units (IV, (13)).

$$\sigma_F = (\sigma_1^2(\sigma_2^2 + \sigma_3^2 - 2p_{23}) + p_{12}^2 + p_{13}^2 - 2p_{12}p_{13})^{1/2}/N^{1/2}. \tag{12}$$

$$\sigma_{F'} = (\sigma_1^2\sigma_2^2 + \sigma_3^2\sigma_4^2 + p_{12}^2 + p_{34}^2 - 2p_{13}p_{24} - 2p_{14}p_{23})^{1/2}/N^{1/2}. \tag{13}$$

5.

$$F = \sigma_{p_{12}} = (\sigma_1^2\sigma_2^2 + p_{12}^2)^{1/2}/N^{1/2}. \tag{14}$$

Test for equality of basic units of measurement of several forms of a test (at least 6 or 7). See discussion immediately preceding II, (22).



In this formula, $p_{12}$ is the average of all the covariances of the several forms, and $\sigma_1$ and $\sigma_2$ are the lower and upper quartile values of the distribution of obtained standard deviations. The value of $F$ is to be compared with $\sigma_{p_{jk}}$, the standard deviation of the distribution of observed covariances. If a value of $F$ as great as or greater than $\sigma_{p_{jk}}$ could reasonably arise by chance (as judged by $\sigma_F$), then the several forms may be assumed all to be measuring in the same units.

6. 

$$F = r_{12}r_{13}/r_{23} = p_{12}p_{13}/p_{23}\sigma_1^2. \tag{16}$$

Reliability  of  Form  A  determined  from  a  knowledge  of  three 
tests  of  the  same  ability  (II,  (17),  (18),  (19)). 

Coefficient  of  practical  validity  (III,  (6)). 

Square  of  correlation  between  test  1  and  g,  or  proportion  of 
variance  of  test  one  due  to  g.  See  Spearman  (1927), 
Appendix  p.  xvi,  Kelley  (1928),  p.  41,  and  Dunlap  (1931). 

$$\sigma_F = (F/N^{1/2})\,(1/r_{12}^2 + 1/r_{13}^2 + 1/r_{23}^2 + 4F + 2/F - 2r_{13}/r_{12}r_{23} - 2r_{12}/r_{13}r_{23} - 5)^{1/2}. \tag{17}$$

A  formula  for  the  standard  error  of  this  function  was  given  by 
Kelley  (1928),  pp.  40-41,  based  on  the  fundamental  formulas  of 
Filon  and  Pearson  (1898).  In  the  form  there  given  it  is  much 
longer  than  (17). 
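Formulas (16) and (17) may be computed as in the following sketch. The function names and the illustrative correlations are assumptions, not part of the text.

```python
import math

def triad(r12, r13, r23):
    """The function F of formula (16)."""
    return r12 * r13 / r23

def triad_se(r12, r13, r23, N):
    """Standard error of the triad, formula (17)."""
    F = triad(r12, r13, r23)
    bracket = (1 / r12 ** 2 + 1 / r13 ** 2 + 1 / r23 ** 2
               + 4 * F + 2 / F
               - 2 * r13 / (r12 * r23) - 2 * r12 / (r13 * r23) - 5)
    return (F / math.sqrt(N)) * math.sqrt(bracket)

# Assumed values: all three correlations equal to 0.5, N = 100.
F = triad(0.5, 0.5, 0.5)
se = triad_se(0.5, 0.5, 0.5, 100)
```

In this symmetric case the bracket reduces to 5, so that the standard error is $0.05\sqrt{5}$.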


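7.

$$F = (r_{12}r_{14}r_{23}r_{34})^{1/4}/(r_{13}r_{24})^{1/2} = (p_{12}p_{14}p_{23}p_{34})^{1/4}/(p_{13}p_{24})^{1/2}. \tag{18}$$

Coefficient of fundamental validity (III, (4)).

Correlation corrected for attenuation, when all the errors of measurement are uncorrelated (IV, (5)).

$$\begin{aligned}
\sigma_F = (F/4N^{1/2})\,[\,&1/r_{12}^2 + 1/r_{14}^2 + 1/r_{23}^2 + 1/r_{34}^2 + 4/r_{13}^2 + 4/r_{24}^2\\
&+ 2(r_{13}r_{24} + r_{14}r_{23})/r_{12}r_{34} + 2(r_{12}r_{34} + r_{13}r_{24})/r_{14}r_{23}\\
&+ 8(r_{12}r_{34} + r_{14}r_{23})/r_{13}r_{24}\\
&+ 2(r_{13}/r_{12}r_{23} + r_{13}/r_{14}r_{34} + r_{24}/r_{12}r_{14} + r_{24}/r_{23}r_{34})\\
&- 4(r_{12}/r_{13}r_{23} + r_{12}/r_{14}r_{24} + r_{14}/r_{12}r_{24} + r_{14}/r_{13}r_{34}\\
&\qquad + r_{23}/r_{12}r_{13} + r_{23}/r_{24}r_{34} + r_{34}/r_{13}r_{14} + r_{34}/r_{23}r_{24}) - 12\,]^{1/2}.
\end{aligned} \tag{19}$$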
A simpler formula for the standard error of this function was given by Kelley (1923), p. 210, based on the additional assumption that $r_{12} = r_{14} = r_{23} = r_{34} = r$, say. This assumption is justified whenever it can be shown that the two forms of each test are equally reliable. In this case,

$$F' = r/(r_{13}r_{24})^{1/2}. \tag{20}$$

$$\sigma_{F'} = (F'/2N^{1/2})\,\{1/r_{13}^2 + 1/r_{24}^2 + 1/r^2 + 4F'^2 + 1/F'^2 + r_{13}/r^2 + r_{24}/r^2 - 4/r_{13} - 4/r_{24} - 2\}^{1/2}. \tag{21}$$

This formula is identical with Kelley's. In computation, $r$ is to be taken as $(r_{12}r_{14}r_{23}r_{34})^{1/4}$.


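8.

$$F = r_{12}r_{34}/r_{13}r_{24} = p_{12}p_{34}/p_{13}p_{24}. \tag{22}$$

Tetrad ratio (V, (5)).

Partial check for equivalence of tests and criterion measures (III, (5), letting subscripts 1, 2, i, ii = 1, 2, 3, 4, respectively).

Check for the assumption that response errors are uncorrelated (IV, (14)).

$$\begin{aligned}
\sigma_F = (F/N^{1/2})\,[\,&1/r_{12}^2 + 1/r_{34}^2 + 1/r_{13}^2 + 1/r_{24}^2 + 2(1/F + F\\
&+ r_{14}r_{23}/r_{12}r_{34} + r_{14}r_{23}/r_{13}r_{24} - r_{23}/r_{12}r_{13}\\
&- r_{23}/r_{24}r_{34} - r_{14}/r_{12}r_{24} - r_{14}/r_{13}r_{34}) - 4\,]^{1/2}.
\end{aligned} \tag{23}$$

9.

$$F = (r_{12}r_{34}/r_{13}r_{24})^{1/2} = (p_{12}p_{34}/p_{13}p_{24})^{1/2}. \tag{24}$$

Correlation corrected for attenuation (IV, (3) and (4), the latter by letting the subscripts 1, 2, i, ii, equal 1, 2, 3, 4, respectively).

$$\begin{aligned}
\sigma_F = (F/2N^{1/2})\,[\,&1/r_{12}^2 + 1/r_{34}^2 + 1/r_{13}^2 + 1/r_{24}^2 + 2(1/F^2 + F^2\\
&+ r_{14}r_{23}/r_{12}r_{34} + r_{14}r_{23}/r_{13}r_{24} - r_{23}/r_{12}r_{13}\\
&- r_{23}/r_{24}r_{34} - r_{14}/r_{12}r_{24} - r_{14}/r_{13}r_{34}) - 4\,]^{1/2}.
\end{aligned} \tag{25}$$

If the two forms of each test are equally reliable, we may assume that $r_{12} = r_{34} = r$, and $r_{14} = r_{23} = r'$. In this case,

$$F' = r/(r_{13}r_{24})^{1/2}. \tag{26}$$

$$\begin{aligned}
\sigma_{F'} = (F'/2N^{1/2})\,[\,&1/r_{13}^2 + 1/r_{24}^2 + 2(F'^2 + 1/F'^2 + 1/r^2\\
&+ r'^2/r^2 + r'^2/r_{13}r_{24} - 2r'/rr_{13} - 2r'/rr_{24}) - 4\,]^{1/2}.
\end{aligned} \tag{27}$$

The values of $r$ and $r'$ for purposes of computation should be taken as $(r_{12}r_{34})^{1/2}$ and $(r_{14}r_{23})^{1/2}$ respectively.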
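The correction for attenuation and its standard error may be sketched as below. The function names and the correlations are hypothetical, and the equal-reliability version is written on the assumption that (27) is (25) with $r_{12} = r_{34} = r$ and $r_{14} = r_{23} = r'$ substituted, so that the two routines should agree whenever those equalities hold.

```python
import math

def corrected_r(r12, r34, r13, r24):
    """Correlation corrected for attenuation, formula (24)."""
    return math.sqrt(r12 * r34 / (r13 * r24))

def corrected_r_se(r12, r34, r14, r23, r13, r24, N):
    """Standard error of (24), formula (25)."""
    F = corrected_r(r12, r34, r13, r24)
    inner = (1 / F ** 2 + F ** 2
             + r14 * r23 / (r12 * r34) + r14 * r23 / (r13 * r24)
             - r23 / (r12 * r13) - r23 / (r24 * r34)
             - r14 / (r12 * r24) - r14 / (r13 * r34))
    bracket = (1 / r12 ** 2 + 1 / r34 ** 2 + 1 / r13 ** 2 + 1 / r24 ** 2
               + 2 * inner - 4)
    return F / (2 * math.sqrt(N)) * math.sqrt(bracket)

def corrected_r_se_equal(r, r_prime, r13, r24, N):
    """Equal-reliability case: (25) with r12 = r34 = r and r14 = r23 = r'."""
    F = r / math.sqrt(r13 * r24)
    inner = (F ** 2 + 1 / F ** 2 + 1 / r ** 2
             + r_prime ** 2 / r ** 2 + r_prime ** 2 / (r13 * r24)
             - 2 * r_prime / (r * r13) - 2 * r_prime / (r * r24))
    bracket = 1 / r13 ** 2 + 1 / r24 ** 2 + 2 * inner - 4
    return F / (2 * math.sqrt(N)) * math.sqrt(bracket)

# Assumed values: cross-correlations 0.42 and 0.40, reliabilities 0.7 and 0.6.
F = corrected_r(0.42, 0.42, 0.7, 0.6)
se_a = corrected_r_se(0.42, 0.42, 0.40, 0.40, 0.7, 0.6, 200)
se_b = corrected_r_se_equal(0.42, 0.40, 0.7, 0.6, 200)
```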
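10. The standard errors of many formulas depend upon the sampling variances and covariances of correlations corrected for attenuation. The sampling variance is simply the square of the standard error, which may be obtained from (25) or (27).

$$\begin{aligned}
r_{\infty\omega} &= (r_{12}r_{34}/r_{13}r_{24})^{1/2},\\
r_{\infty\gamma} &= (r_{17}r_{35}/r_{13}r_{57})^{1/2},\\
r_{\gamma\eta} &= (r_{56}r_{78}/r_{57}r_{68})^{1/2}.
\end{aligned}$$

$$\begin{aligned}
P_{r_{\infty\omega}r_{\infty\gamma}} = (r_{\infty\omega}r_{\infty\gamma}/4N)\,[\,&(1/r_{17}r_{34})(r_{13}r_{47} + r_{14}r_{37})\\
&+ (1/r_{12}r_{35})(r_{13}r_{25} + r_{15}r_{23}) + (1/r_{13}r_{24})(r_{12}r_{34} + r_{14}r_{23})\\
&+ (1/r_{13}r_{57})(r_{15}r_{37} + r_{17}r_{35}) + (1/r_{24}r_{57})(r_{25}r_{47} + r_{27}r_{45})\\
&- (1/r_{17}r_{24})(r_{12}r_{47} + r_{14}r_{27}) - (1/r_{24}r_{35})(r_{23}r_{45} + r_{25}r_{34})\\
&- (1/r_{12}r_{57})(r_{15}r_{27} + r_{17}r_{25}) - (1/r_{34}r_{57})(r_{35}r_{47} + r_{37}r_{45})\\
&+ r_{27}/r_{12}r_{17} + r_{45}/r_{34}r_{35} - r_{37}/r_{13}r_{17} - r_{15}/r_{13}r_{35}\\
&- r_{23}/r_{12}r_{13} - r_{14}/r_{13}r_{34} + 1/r_{13}^2 - 1\,].
\end{aligned} \tag{28}$$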

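$$\begin{aligned}
P_{r_{\infty\omega}r_{\gamma\eta}} = (r_{\infty\omega}r_{\gamma\eta}/4N)\,[\,&(1/r_{12}r_{56})(r_{15}r_{26} + r_{16}r_{25}) + (1/r_{34}r_{56})(r_{35}r_{46} + r_{36}r_{45})\\
&+ (1/r_{12}r_{78})(r_{17}r_{28} + r_{18}r_{27}) + (1/r_{34}r_{78})(r_{37}r_{48} + r_{38}r_{47})\\
&+ (1/r_{13}r_{57})(r_{15}r_{37} + r_{17}r_{35}) + (1/r_{24}r_{57})(r_{25}r_{47} + r_{27}r_{45})\\
&+ (1/r_{13}r_{68})(r_{16}r_{38} + r_{18}r_{36}) + (1/r_{24}r_{68})(r_{26}r_{48} + r_{28}r_{46})\\
&- (1/r_{13}r_{56})(r_{15}r_{36} + r_{16}r_{35}) - (1/r_{24}r_{56})(r_{25}r_{46} + r_{26}r_{45})\\
&- (1/r_{13}r_{78})(r_{17}r_{38} + r_{18}r_{37}) - (1/r_{24}r_{78})(r_{27}r_{48} + r_{28}r_{47})\\
&- (1/r_{12}r_{57})(r_{15}r_{27} + r_{17}r_{25}) - (1/r_{34}r_{57})(r_{35}r_{47} + r_{37}r_{45})\\
&- (1/r_{12}r_{68})(r_{16}r_{28} + r_{18}r_{26}) - (1/r_{34}r_{68})(r_{36}r_{48} + r_{38}r_{46})\,].
\end{aligned} \tag{29}$$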

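If the two forms of each test are equally reliable, we may assume that $r_{12} = r_{34}$, $r_{14} = r_{23}$, $r_{17} = r_{35}$, $r_{15} = r_{37}$, $r_{16} = r_{38}$, $r_{18} = r_{36}$, $r_{47} = r_{25}$, $r_{45} = r_{27}$, $r_{46} = r_{28}$, $r_{48} = r_{26}$, $r_{56} = r_{78}$, and $r_{58} = r_{67}$. The reliability coefficients, $r_{13}$, $r_{24}$, $r_{57}$, and $r_{68}$ will all be different. Designating either of the correlations of an equal pair by the subscripts of the first, we have,

$$\begin{aligned}
P'_{r_{\infty\omega}r_{\infty\gamma}} = (r_{\infty\omega}r_{\infty\gamma}/4N)\,[\,&(2/r_{12}r_{17})(r_{13}r_{47} + r_{14}r_{15})\\
&+ (1/r_{13}r_{24})(r_{12}^2 + r_{14}^2) + (1/r_{13}r_{57})(r_{15}^2 + r_{17}^2) + (1/r_{24}r_{57})(r_{45}^2 + r_{47}^2)\\
&- (2/r_{17}r_{24})(r_{12}r_{47} + r_{14}r_{45}) - (2/r_{12}r_{57})(r_{15}r_{45} + r_{17}r_{47})\\
&+ 2r_{45}/r_{12}r_{17} - 2r_{15}/r_{13}r_{17} - 2r_{14}/r_{12}r_{13} + 1/r_{13}^2 - 1\,].
\end{aligned} \tag{30}$$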



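$$\begin{aligned}
P'_{r_{\infty\omega}r_{\gamma\eta}} = (r_{\infty\omega}r_{\gamma\eta}/4N)\,[\,&(2/r_{12}r_{56})(r_{15}r_{48} + r_{16}r_{47} + r_{17}r_{46} + r_{18}r_{45})\\
&+ (1/r_{13}r_{57})(r_{15}^2 + r_{17}^2) + (1/r_{24}r_{57})(r_{47}^2 + r_{45}^2)\\
&+ (1/r_{13}r_{68})(r_{16}^2 + r_{18}^2) + (1/r_{24}r_{68})(r_{48}^2 + r_{46}^2)\\
&- (2/r_{13}r_{56})(r_{15}r_{18} + r_{16}r_{17}) - (2/r_{24}r_{56})(r_{45}r_{48} + r_{46}r_{47})\\
&- (2/r_{12}r_{57})(r_{15}r_{45} + r_{17}r_{47}) - (2/r_{12}r_{68})(r_{16}r_{46} + r_{18}r_{48})\,].
\end{aligned} \tag{31}$$

For purposes of computation with formulas (30) and (31), the values of $r_{12}$, $r_{14}$, . . . should be taken as $(r_{12}r_{34})^{1/2}$, $(r_{14}r_{23})^{1/2}$, etc.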

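11.

$$\begin{aligned}
F &= (p_{12}p_{34}p_{56}p_{78}/p_{17}p_{35}p_{46}p_{28})^{1/2}\\
&= (r_{12}r_{34}r_{56}r_{78}/r_{17}r_{35}r_{46}r_{28})^{1/2}\\
&= r_{\infty\omega}r_{\gamma\eta}/r_{\infty\gamma}r_{\omega\eta}.
\end{aligned} \tag{32}$$

Practical tetrad ratio (V, (9), (10), (19)).

$$\begin{aligned}
\sigma_F = F\,(\,&\sigma^2_{r_{\infty\omega}}/r_{\infty\omega}^2 + \sigma^2_{r_{\gamma\eta}}/r_{\gamma\eta}^2 + \sigma^2_{r_{\infty\gamma}}/r_{\infty\gamma}^2 + \sigma^2_{r_{\omega\eta}}/r_{\omega\eta}^2\\
&+ 2P_{r_{\infty\omega}r_{\gamma\eta}}/r_{\infty\omega}r_{\gamma\eta} + 2P_{r_{\infty\gamma}r_{\omega\eta}}/r_{\infty\gamma}r_{\omega\eta}\\
&- 2P_{r_{\infty\omega}r_{\infty\gamma}}/r_{\infty\omega}r_{\infty\gamma} - 2P_{r_{\infty\omega}r_{\omega\eta}}/r_{\infty\omega}r_{\omega\eta}\\
&- 2P_{r_{\infty\gamma}r_{\gamma\eta}}/r_{\infty\gamma}r_{\gamma\eta} - 2P_{r_{\omega\eta}r_{\gamma\eta}}/r_{\omega\eta}r_{\gamma\eta}\,)^{1/2}.
\end{aligned} \tag{33}$$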

12.

$$\begin{aligned}
F &= (r_{12}r_{34}r_{17}r_{35}/r_{47}r_{25})^{1/2}\\
&= (p_{12}p_{34}p_{17}p_{35}/p_{47}p_{25}\sigma_1^2\sigma_3^2)^{1/2}.
\end{aligned} \tag{34}$$

Proportion of the variance of either form of variable one due to the general factor, when the two forms are equally reliable. (V, (13) and (21), the latter by letting subscripts 1, i, 3, iii, 4, iv, equal 1, 3, 4, 2, 5, 7, respectively).

Since the two forms of each test are assumed to be equally reliable, $r_{12} = r_{34}$, $r_{14} = r_{23}$, $r_{17} = r_{35}$, $r_{15} = r_{37}$, $r_{47} = r_{25}$, and $r_{45} = r_{27}$. The reliability coefficients, $r_{13}$, $r_{24}$, and $r_{57}$ will all be different. Then designating either of the correlations of an equal pair by the subscripts of the first, we have,

$$\begin{aligned}
\sigma_F = (F/2N^{1/2})\,[\,&2/r_{12}^2 + 2/r_{17}^2 + 2/r_{47}^2 + 4r_{13}^2\\
&+ (2/r_{12}^2)(r_{13}r_{24} + r_{14}^2) + (2/r_{17}^2)(r_{13}r_{57} + r_{15}^2) + (2/r_{47}^2)(r_{24}r_{57} + r_{45}^2)\\
&+ (4/r_{12}r_{17})(r_{13}r_{47} + r_{14}r_{15}) - (4/r_{12}r_{47})(r_{14}r_{45} + r_{17}r_{24})\\
&- (4/r_{17}r_{47})(r_{12}r_{57} + r_{15}r_{45}) + (8/r_{47})(r_{12}r_{15} + r_{14}r_{17})\\
&+ 4(r_{45}/r_{12}r_{17} - r_{15}/r_{12}r_{47} - r_{14}/r_{17}r_{47})\\
&- 8(r_{13}r_{14}/r_{12} + r_{13}r_{15}/r_{17}) - 10\,]^{1/2}.
\end{aligned} \tag{35}$$

For purposes of computation, the values of $r_{12}$, $r_{14}$, . . . should be taken as $(r_{12}r_{34})^{1/2}$, $(r_{14}r_{23})^{1/2}$, etc.
13.

$$F = r_{13} - F' \quad \text{(where } F' \text{ is the } F \text{ of (34))}. \tag{36}$$

Proportion of the variance of either form of variable one due to the non-chance specific factor, when the two forms are equally reliable and the theory of two factors holds (V, (15)).

Under the same assumptions as those involved in the derivation of (35), we find,

$$\sigma_F = (\sigma^2_{r_{13}} + \sigma^2_{F'} - 2P_{F'r_{13}})^{1/2}. \tag{37}$$

$$\sigma^2_{r_{13}} = (1 - r_{13}^2)^2/N.$$

$\sigma^2_{F'}$ is the square of equation (35).

$$\begin{aligned}
P_{F'r_{13}} = (F'r_{13}/2N)\,[\,&2r_{14}/r_{12}r_{13} + 2r_{15}/r_{13}r_{17}\\
&+ 2r_{12}r_{15}/r_{47} + 2r_{14}r_{17}/r_{47} - 2r_{13}r_{14}/r_{12}\\
&- 2r_{13}r_{15}/r_{17} - (2/r_{13}r_{47})(r_{12}r_{17} + r_{14}r_{15})\\
&+ 2r_{13}^2 - 2\,].
\end{aligned} \tag{38}$$

For purposes of computation, we must follow the same procedure as for (35).

14.

$$F = 1 - r_{13}. \tag{39}$$

Proportion of the variance of either form of variable one due to the error of measurement, when the two forms are equally reliable (V, (14)).

$$\sigma_F = \sigma_{r_{13}} = (1 - r_{13}^2)/N^{1/2}. \tag{40}$$
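Formulas (34), (36), and (39) together divide the variance of one form of variable one into general, non-chance specific, and error portions. The sketch below computes the three from the observed correlations; the loadings and reliabilities used to manufacture those correlations are hypothetical, as is the function name.

```python
import math

def form_variance_shares(r12, r34, r17, r35, r47, r25, r13):
    """General, specific and error shares of one form, from (34), (36), (39)."""
    general = math.sqrt(r12 * r34 * r17 * r35 / (r47 * r25))   # (34)
    specific = r13 - general                                    # (36)
    error = 1.0 - r13                                           # (39)
    return general, specific, error

# Assumed "true" loadings on g and reliabilities for variables one, two, three.
a1, a2, a3 = 0.9, 0.8, 0.7
rel1, rel2, rel3 = 0.8, 0.7, 0.9

# Observed correlations between forms of different variables, attenuated
# by the square roots of the reliabilities.
r12 = r34 = a1 * a2 * math.sqrt(rel1 * rel2)
r17 = r35 = a1 * a3 * math.sqrt(rel1 * rel3)
r47 = r25 = a2 * a3 * math.sqrt(rel2 * rel3)

general, specific, error = form_variance_shares(r12, r34, r17, r35, r47, r25, rel1)
```

The general share equals the assumed squared loading multiplied by the reliability of the form, and the three shares sum to unity.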
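15.

$$F = r_{\infty\omega}r_{\infty\gamma}/r_{\omega\gamma} = \alpha_1^2 = t_{123}. \tag{41}$$

Proportion of the "true" variance of variable one due to the general factor (V, (39), (43)). The triad.

$$\begin{aligned}
\sigma_F = F\,(\,&\sigma^2_{r_{\infty\omega}}/r_{\infty\omega}^2 + \sigma^2_{r_{\infty\gamma}}/r_{\infty\gamma}^2 + \sigma^2_{r_{\omega\gamma}}/r_{\omega\gamma}^2\\
&+ 2P_{r_{\infty\omega}r_{\infty\gamma}}/r_{\infty\omega}r_{\infty\gamma} - 2P_{r_{\infty\omega}r_{\omega\gamma}}/r_{\infty\omega}r_{\omega\gamma}\\
&- 2P_{r_{\infty\gamma}r_{\omega\gamma}}/r_{\infty\gamma}r_{\omega\gamma}\,)^{1/2}.
\end{aligned} \tag{42}$$

16.

$$F = 1 - r_{\infty\omega}r_{\infty\gamma}/r_{\omega\gamma}. \tag{43}$$

Proportion of the "true" variance of variable one due to the specific factor, when the theory of two factors holds (V, (40)).

$\sigma_F$ in this case is given also by (42).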
The various formulas of Chapter II which deal with the sums and averages of unequally reliable tests are not included here, nor are those of Chapter V which deal with the analysis of variances, covariances, and "true" correlations for the single group-factor theory. While these formulas are of undoubted importance, they are not of such general usefulness as the ones given here, and their standard errors are of such extraordinary complexity that they would seldom, if ever, be used in any ordinary investigation.


APPENDIX  I 
Computation  of  the  Intraclass  Correlation  Coefficient 

Suppose  we  have  given  n  strictly  comparable  tests  to  each  of 
N  individuals.  We  first  make  a  single  frequency  distribution  of 
the  nN  scores.  From  this  distribution  we  obtain  A  according  to 
the  relation, 

$$A = nN\sum_{1}^{nN} X^2 - \Big(\sum_{1}^{nN} X\Big)^2.$$

We next obtain the sum of the n scores of each individual.

$$S_j = \sum_{1}^{n} X.$$

There will be N such sums. These are now arranged in a new frequency distribution, from which we obtain B according to the relation,

$$B = N\sum_{1}^{N} S_j^2 - \Big(\sum_{1}^{N} S_j\Big)^2.$$

A partial check on the computations may be obtained by noting that,

$$\sum_{1}^{N} S_j = \sum_{1}^{nN} X.$$

Finally, we have

$$r = [(nN-1)B - (N-1)A] \div [(n-1)B + (n-1)(N-1)A].$$

The standard error of the intraclass correlation is not a satisfactory measure, as its distribution in samples is decidedly skew. By employing a transformation devised by R. A. Fisher, however, this difficulty may be largely avoided. Let,

$$z = \tfrac{1}{2}\log(1 + (n-1)r) - \tfrac{1}{2}\log(1 - r).$$

Then,

$$\sigma_z = n^{1/2}/(2(n-1)(N-2))^{1/2}.$$

A  table  for  finding  z  from  r  is  given  by  Fisher  (1928),  who  also 
treats  the  theoretical  aspects  of  the  problem  at  some  length. 
Having  found  the  standard  error  of  z,  we  may  add  and  subtract 
twice  or  three  times  this  value  from  the  value  of  z.  Looking  up 
the  values  of  r  corresponding  to  these,  we  obtain  some  notion  of 
the  probable  limits  of  its  chance  variation. 
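The whole computation of this appendix may be sketched as follows. The function names and the small set of scores are hypothetical; the quantity B is computed on the assumption that it is formed from the N sums in the same way that A is formed from the nN scores, and natural logarithms are assumed in the transformation.

```python
import math

def intraclass_r(scores):
    """Intraclass correlation from N rows of n comparable scores each."""
    N, n = len(scores), len(scores[0])
    all_x = [x for row in scores for x in row]
    A = n * N * sum(x * x for x in all_x) - sum(all_x) ** 2
    S = [sum(row) for row in scores]              # one sum per individual
    assert sum(S) == sum(all_x)                   # the partial check in the text
    B = N * sum(s * s for s in S) - sum(S) ** 2
    return ((n * N - 1) * B - (N - 1) * A) / ((n - 1) * B + (n - 1) * (N - 1) * A)

def fisher_z(r, n):
    """Fisher's transformation of the intraclass correlation."""
    return 0.5 * math.log(1 + (n - 1) * r) - 0.5 * math.log(1 - r)

def z_se(n, N):
    """Standard error of z."""
    return math.sqrt(n / (2 * (n - 1) * (N - 2)))

# Assumed data: two comparable tests given to four individuals.
data = [[1, 2], [3, 3], [5, 4], [6, 8]]
r = intraclass_r(data)
z = fisher_z(r, 2)
limits = (z - 2 * z_se(2, 4), z + 2 * z_se(2, 4))
```

For these scores A = 288 and B = 264, giving r = 984/1128. The limits in z would then be converted back to values of r by means of the table, or by inverting the transformation.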